ANALYTIC COMBINATORICS


Analytic combinatorics aims to enable precise quantitative predictions of the proper- 
ties of large combinatorial structures. The theory has emerged over recent decades 
as essential both for the analysis of algorithms and for the study of scientific models 
in many disciplines, including probability theory, statistical physics, computational 
biology and information theory. With a careful combination of symbolic enumera- 
tion methods and complex analysis, drawing heavily on generating functions, results 
of sweeping generality emerge that can be applied in particular to fundamental struc- 
tures such as permutations, sequences, strings, walks, paths, trees, graphs and maps. 
This account is the definitive treatment of the topic. In order to make it self- 
contained, the authors give full coverage of the underlying mathematics and give a 
thorough treatment of both classical and modern applications of the theory. The text is 
complemented with exercises, examples, appendices and notes throughout the book to 
aid understanding. The book can be used as a reference for researchers, as a textbook 
for an advanced undergraduate or a graduate course on the subject, or for self-study. 


PHILIPPE FLAJOLET is Research Director of the Algorithms Project at INRIA Roc- 
quencourt. 


ROBERT SEDGEWICK is William O. Baker Professor of Computer Science at Prince- 
ton University. 
Preface


ANALYTIC COMBINATORICS aims at predicting precisely the properties of large 
structured combinatorial configurations, through an approach based extensively on 
analytic methods. Generating functions are the central objects of study of the theory. 

Analytic combinatorics starts from an exact enumerative description of combina- 
torial structures by means of generating functions: these make their first appearance as 
purely formal algebraic objects. Next, generating functions are interpreted as analytic 
objects, that is, as mappings of the complex plane into itself. Singularities determine 
a function’s coefficients in asymptotic form and lead to precise estimates for counting 
sequences. This chain of reasoning applies to a large number of problems of discrete 
mathematics relative to words, compositions, partitions, trees, permutations, graphs, 
mappings, planar configurations, and so on. A suitable adaptation of the methods also 
opens the way to the quantitative analysis of characteristic parameters of large random 
structures, via a perturbational approach. 


THE APPROACH to quantitative problems of discrete mathematics provided by 
analytic combinatorics can be viewed as an operational calculus for combinatorics 
organized around three components. 


Symbolic methods develops systematic relations between some of the major 
constructions of discrete mathematics and operations on generating func- 
tions that exactly encode counting sequences. 

Complex asymptotics elaborates a collection of methods by which one can 
extract asymptotic counting information from generating functions, once 
these are viewed as analytic transformations of the complex domain. Singu- 
larities then appear to be a key determinant of asymptotic behaviour. 
Random structures concerns itself with probabilistic properties of large ran- 
dom structures. Which properties hold with high probability? Which laws 
govern randomness in large objects? In the context of analytic combina- 
torics, these questions are treated by a deformation (adding auxiliary vari- 
ables) and a perturbation (examining the effect of small variations of such 
auxiliary variables) of the standard enumerative theory. 


The present book expounds this view by means of a very large number of examples 
concerning classical objects of discrete mathematics and combinatorics. The eventual 
goal is an effective way of quantifying metric properties of large random structures. 


Given its capacity of quantifying properties of large discrete structures, Analytic
Combinatorics is susceptible to many applications, not only within combinatorics it- 
self, but, perhaps more importantly, within other areas of science where discrete prob- 
abilistic models recurrently surface, like statistical physics, computational biology, 
electrical engineering, and information theory. Last but not least, the analysis of al- 
gorithms and data structures in computer science has served and still serves as an 
important incentive for the development of the theory. 




Part A: Symbolic methods. This part specifically develops Symbolic methods, which 
constitute a unified algebraic theory dedicated to setting up functional relations be- 
tween counting generating functions. As it turns out, a collection of general (and 
simple) theorems provide a systematic translation mechanism between combinatorial 
constructions and operations on generating functions. This translation process is a 
purely formal one. In fact, with regard to basic counting, two parallel frameworks 
coexist—one for unlabelled structures and ordinary generating functions, the other 
for labelled structures and exponential generating functions. Furthermore, within the 
theory, parameters of combinatorial configurations can be easily taken into account 
by adding supplementary variables. Three chapters then form Part A: Chapter I deals 
with unlabelled objects; Chapter II develops labelled objects in a parallel way; Chapter III
treats multivariate aspects of the theory suitable for the analysis of parameters
of combinatorial structures. 




Part B: Complex asymptotics. This part specifically expounds Complex asymptotics, 
which is a unified analytic theory dedicated to the process of extracting asymptotic in- 
formation from counting generating functions. A collection of general (and simple) 
theorems now provide a systematic translation mechanism between generating func- 
tions and asymptotic forms of coefficients. Five chapters form this part. Chapter IV 
serves as an introduction to complex-analytic methods and proceeds with the treatment 
of meromorphic functions, that is, functions whose singularities are poles, rational 
functions being the simplest case. Chapter V develops applications of rational and 
meromorphic asymptotics of generating functions, with numerous applications related 
to words and languages, walks and graphs, as well as permutations. Chapter VI devel- 
ops a general theory of singularity analysis that applies to a wide variety of singular- 
ity types, such as square-root or logarithmic, and has consequences regarding trees as 
well as other recursively-defined combinatorial classes. Chapter VII presents appli- 
cations of singularity analysis to 2-regular graphs and polynomials, trees of various 
sorts, mappings, context-free languages, walks, and maps. It contains in particular a 
discussion of the analysis of coefficients of algebraic functions. Chapter VIII explores 
saddle-point methods, which are instrumental in analysing functions with a violent 
growth at a singularity, as well as many functions with a singularity only at infinity 
(i.e., entire functions). 


Part C: Random structures. This part is comprised of Chapter IX, which is dedicated
to the analysis of multivariate generating functions viewed as deformation and
perturbation of simple (univariate) functions. Many known laws of probability theory, 
either discrete or continuous, from Poisson to Gaussian and stable distributions, are 
found to arise in combinatorics, by a process combining symbolic methods, complex 
asymptotics, and perturbation methods. As a consequence, many important character- 
istics of classical combinatorial structures can be precisely quantified in distribution. 




Part D: Appendices. Appendix A summarizes some key elementary concepts of 
combinatorics and asymptotics, with entries relative to asymptotic expansions, lan- 
guages, and trees, among others. Appendix B recapitulates the necessary background 
in complex analysis. It may be viewed as a self-contained minicourse on the subject, 
with entries relative to analytic functions, the Gamma function, the implicit function 
theorem, and Mellin transforms. Appendix C recalls some of the basic notions of 
probability theory that are useful in analytic combinatorics. 




THIS BOOK is meant to be reader-friendly. Each major method is abundantly illustrated
by means of concrete Examples¹ treated in detail—there are scores of them,
spanning from a fraction of a page to several pages—offering a complete treatment of 
a specific problem. These are borrowed not only from combinatorics itself but also 
from neighbouring areas of science. With a view to addressing not only mathemati- 
cians of varied profiles but also scientists of other disciplines, Analytic Combinatorics 
is self-contained, including ample appendices that recapitulate the necessary back- 
ground in combinatorics, complex function theory, and probability. A rich set of short 
Notes—there are more than 450 of them—are inserted in the text² and can provide
exercises meant for self-study or for student practice, as well as introductions to the 
vast body of literature that is available. We have also made every effort to focus on 
core ideas rather than technical details, supposing a certain amount of mathematical 
maturity but only basic prerequisites on the part of our gentle readers. The book is 
also meant to be strongly problem-oriented, and indeed it can be regarded as a man- 
ual, or even a huge algorithm, guiding the reader to the solution of a very large variety 
of problems regarding discrete mathematical models of varied origins. In this spirit, 
many of our developments connect nicely with computer algebra and symbolic ma- 
nipulation systems. 


COURSES can be (and indeed have been) based on the book in various ways. 
Chapters I–III on Symbolic methods serve as a systematic yet accessible introduction
to the formal side of combinatorial enumeration. As such it organizes transparently
some of the rich material found in treatises³ such as those of Bergeron–Labelle–Leroux,
Comtet, Goulden–Jackson, and Stanley. Chapters IV–VIII relative to
Complex asymptotics provide a large set of concrete examples illustrating the power 


¹ Examples are marked by “Example · · · ■”.
² Notes are indicated by ▷ · · · ◁.
³ References are to be found in the bibliography section at the end of the book.




of classical complex analysis and of asymptotic analysis outside of their traditional 
range of applications. This material can thus be used in courses of either pure or 
applied mathematics, providing a wealth of non-classical examples. In addition, the 
quiet but ubiquitous presence of symbolic manipulation systems provides a number of 
illustrations of the power of these systems while making it possible to test and con- 
cretely experiment with a great many combinatorial models. Symbolic systems allow 
for instance for fast random generation, close examination of non-asymptotic regimes, 
efficient experimentation with analytic expansions and singularities, and so on. 

Our initial motivation when starting this project was to build a coherent set of 
methods useful in the analysis of algorithms, a domain of computer science now well- 
developed and presented in books by Knuth, Hofri, Mahmoud, and Szpankowski, in 
the survey by Vitter—Flajolet, as well as in our earlier Introduction to the Analysis of 
Algorithms published in 1996. This book, Analytic Combinatorics, can then be used 
as a systematic presentation of methods that have proved immensely useful in this 
area; see in particular the Art of Computer Programming by Knuth for background. 
Studies in statistical physics (van Rensburg, and others), statistics (e.g., David and 
Barton) and probability theory (e.g., Billingsley, Feller), mathematical logic (Burris’ 
book), analytic number theory (e.g., Tenenbaum), computational biology (Waterman’s 
textbook), as well as information theory (e.g., the books by Cover-Thomas, MacKay, 
and Szpankowski) point to many startling connections with yet other areas of science. 
The book may thus be useful as a supplementary reference on methods and applica- 
tions in courses on statistics, probability theory, statistical physics, finite model the- 
ory, analytic number theory, information theory, computer algebra, complex analysis, 
or analysis of algorithms. 
— PLATO, The Timaeus¹


ANALYTIC COMBINATORICS is primarily a book about combinatorics, that is, the 
study of finite structures built according to a finite set of rules. Analytic in the title 
means that we concern ourselves with methods from mathematical analysis, in par- 
ticular complex and asymptotic analysis. The two fields, combinatorial enumeration 
and complex analysis, are organized into a coherent set of methods for the first time 
in this book. Our broad objective is to discover how the continuous may help us to 
understand the discrete and to quantify its properties. 


COMBINATORICS is, as told by its name, the science of combinations. Given 
basic rules for assembling simple components, what are the properties of the resulting 
objects? Here, our goal is to develop methods dedicated to quantitative properties 
of combinatorial structures. In other words, we want to measure things. Say that 
we have n different items like cards or balls of different colours. In how many ways 
can we lay them on a table, all in one row? You certainly recognize this counting 
problem—finding the number of permutations of n elements. The answer is of course 
the factorial number 

\[ n! = 1 \cdot 2 \cdots n. \]


This is a good start, and, equipped with patience or a calculator, we soon determine 
that if n = 31, say, then the number of permutations is the rather large quantity 


\[ 31! = 8222838654177922817725562880000000, \]


an integer with 34 decimal digits. The factorials solve an enumeration problem, one 
that took mankind some time to sort out, because the sense of the “· · ·” in the formula
for n! is not that easily grasped. In his book The Art of Computer Programming 


1 “So their combinations with themselves and with each other give rise to endless complexities, which 
anyone who is to give a likely account of reality must survey.” Plato speaks of Platonic solids viewed as 
idealized primary constituents of the physical universe. 
Figure 0.1. An example of the correspondence between an alternating permutation
(top) and a decreasing binary tree (bottom): each binary node has two descendants, 
which bear smaller labels. Such constructions, which give access to generating func- 
tions and eventually provide solutions to counting problems, are the main subject of 
Part A. 


(vol II, p. 23), Donald Knuth traces the discovery to the Hebrew Book of Creation 
(c. AD 400) and the Indian classic Anuyogadvara-sutra (c. AD 500). 

Here is another more subtle problem. Assume that you are interested in permuta- 
tions such that the first element is smaller than the second, the second is larger than the 
third, itself smaller than the fourth, and so on. The permutations go up and down and 
they are diversely known as up-and-down or zigzag permutations, the more dignified 
name being alternating permutations. Say that n = 2m + 1 is odd. An example for
n = 9 is the permutation 4 8 6 7 5 9 1 3 2, displayed so as to show its rises and falls:

        8       7       9       3
      ↗   ↘   ↗   ↘   ↗   ↘   ↗   ↘
    4       6       5       1       2


The number of alternating permutations for n = 1,3,5,..., 15 turns out to be 
1, 2, 16, 272, 7936, 353792, 22368256, 1903757312. 


What are these numbers and how do they relate to the total number of permutations of 
corresponding size? A glance at the corresponding figures, that is, 1!, 3!,5!,..., 15!, 
or 

1, 6, 120, 5040, 362880, 39916800, 6227020800, 1307674368000, 


suggests that the factorials grow somewhat faster—just compare the lengths of the last 
two displayed lines. But how and by how much? This is the prototypical question we 
are addressing in this book. 
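
For very small sizes, the figures above can be checked by sheer enumeration. The following is a minimal sketch, not a method used in this book: the function names `is_alternating` and `count_alternating` are illustrative choices, and the approach (testing all n! permutations) is only feasible for small n.

```python
from itertools import permutations
from math import factorial


def is_alternating(p):
    """True if p[0] < p[1] > p[2] < p[3] > ... (the up-and-down pattern)."""
    return all(
        (p[i] < p[i + 1]) if i % 2 == 0 else (p[i] > p[i + 1])
        for i in range(len(p) - 1)
    )


def count_alternating(n):
    """Count alternating permutations of 1..n by exhaustive enumeration."""
    return sum(1 for p in permutations(range(1, n + 1)) if is_alternating(p))


# Compare the count with n! and print the proportion of alternating permutations.
for n in (1, 3, 5, 7):
    t = count_alternating(n)
    print(n, t, factorial(n), t / factorial(n))
```

The printed proportions decrease quickly with n, which is the phenomenon that the rest of this discussion sets out to quantify.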

Let us now examine the counting of alternating permutations. In 1881, the French 
mathematician Désiré André made a startling discovery. Look at the first terms of the 
Taylor expansion of the trigonometric function tan z: 


\[ \tan z = 1\,\frac{z}{1!} + 2\,\frac{z^3}{3!} + 16\,\frac{z^5}{5!} + 272\,\frac{z^7}{7!} + 7936\,\frac{z^9}{9!} + 353792\,\frac{z^{11}}{11!} + \cdots. \]

The counting sequence for alternating permutations, 1,2, 16, ..., curiously surfaces. 


We say that the function on the left is a generating function for the numerical se- 
quence (precisely, a generating function of the exponential type, due to the presence 
of factorials in the denominators). 
André’s derivation may nowadays be viewed very simply as reflecting the construction
of permutations by means of certain labelled binary trees (Figure 0.1 and
p. 143): given a permutation σ, a tree can be obtained once σ has been decomposed as
a triple (σ_L, max, σ_R), by taking the maximum element as the root, and appending, as
left and right subtrees, the trees recursively constructed from σ_L and σ_R. Part A of this
book develops at length symbolic methods by which the construction of the class T of
all such trees,


\[ \mathcal{T} = \textcircled{1} \;\cup\; (\mathcal{T}, \max, \mathcal{T}), \]


translates into an equation relating generating functions, 
\[ T(z) = z + \int_0^z T(w)^2 \, dw. \]


In this equation, T(z) := Σ_n T_n z^n/n! is the exponential generating function of the
sequence (T_n), where T_n is the number of alternating permutations of (odd) length n.
There is a compelling formal analogy between the combinatorial specification and
its generating function: Unions (∪) give rise to sums (+), max-placement gives an
integral (∫), forming a pair of trees corresponds to taking a square ([·]²).

At this stage, we know that T(z) must solve the differential equation 


\[ \frac{d}{dz} T(z) = 1 + T(z)^2, \qquad T(0) = 0, \]


which, by classical manipulations,² yields the explicit form
T(z) = tan z.


The generating function then provides a simple algorithm to compute the coefficients 
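recurrently. Indeed, the formula

\[ \tan z = \frac{\sin z}{\cos z} = \frac{z - \frac{z^3}{3!} + \frac{z^5}{5!} - \cdots}{1 - \frac{z^2}{2!} + \frac{z^4}{4!} - \cdots} \]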

implies, for n odd, the relation (extract the coefficient of z^n in T(z) cos z = sin z)


\[ T_n - \binom{n}{2} T_{n-2} + \binom{n}{4} T_{n-4} - \cdots = (-1)^{(n-1)/2}, \qquad \text{where} \quad \binom{a}{b} = \frac{a!}{b!\,(a-b)!} \]


is the conventional notation for binomial coefficients. Now, the exact enumeration 
problem may be regarded as solved since a very simple algorithm is available for 
determining the counting sequence, while the generating function admits an explicit 
expression in terms of well-known mathematical objects. 
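
The “very simple algorithm” can be made concrete. The sketch below solves the relation obtained from T(z) cos z = sin z for T_n, moving all terms with smaller indices to the right-hand side; the function name `tangent_numbers` and the dictionary layout are illustrative assumptions, not notation from this book.

```python
from math import comb


def tangent_numbers(max_n):
    """Return {n: T_n} for odd n <= max_n, using
    T_n - C(n,2) T_{n-2} + C(n,4) T_{n-4} - ... = (-1)^((n-1)/2)."""
    T = {}
    for n in range(1, max_n + 1, 2):
        rhs = (-1) ** ((n - 1) // 2)
        # Terms of the relation that involve already-computed values T_{n-2j}.
        lower_terms = sum(
            (-1) ** j * comb(n, 2 * j) * T[n - 2 * j]
            for j in range(1, (n - 1) // 2 + 1)
        )
        T[n] = rhs - lower_terms
    return T


for n, value in tangent_numbers(15).items():
    print(n, value)
```

Each T_n costs a number of arithmetic operations linear in n, so the whole table up to size n is obtained in quadratic time, in contrast with exhaustive enumeration over n! permutations.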


ANALYSIS, by which we mean mathematical analysis, is often described as the 
art and science of approximation. How fast do the factorial and the tangent number 
sequences grow? What about comparing their growths? These are typical problems 
of analysis. 


² We have T′/(1 + T²) = 1, hence arctan(T) = z and T = tan z.


4 AN INVITATION TO ANALYTIC COMBINATORICS 


Figure 0.2. Two views of the function z ↦ tan z. Left: a plot for real values of z ∈
[−6, 6]. Right: the modulus |tan z| when z = x + iy (with i = √−1) is assigned
complex values in the square ±6 ± 6i. As developed at length in Part B, it is the
nature of singularities in the complex domain that matters.


First, consider the number of permutations, n!. Quantifying its growth, as n gets 
large, takes us to the realm of asymptotic analysis. The way to express factorial numbers
in terms of elementary functions is known as Stirling’s formula,³


\[ n! \sim n^n e^{-n} \sqrt{2\pi n}, \]


where the ~ sign means “approximately equal” (in the precise sense that the ratio of 
both terms tends to 1 as n gets large). This beautiful formula, associated with the
name of the Scottish mathematician James Stirling (1692–1770), curiously involves
both the basis e of natural logarithms and the perimeter 2π of the circle. Certainly,
you cannot get such a thing without analysis. As a first step, there is an estimate 


\[ \log n! = \sum_{j=1}^{n} \log j \;\sim\; \int_1^n \log x \, dx \;\sim\; n \log\left(\frac{n}{e}\right), \]


explaining at least the n^n e^{-n} term, but already requiring a certain amount of elementary
calculus. (Stirling’s formula precisely came a few decades after the fundamental
bases of calculus had been laid by Newton and Leibniz.) Note the utility of Stirling’s
formula: it tells us almost instantly that 100! has 158 digits, while 1000! borders the
astronomical 10^2568.
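
The digit counts just quoted can be recovered from Stirling’s formula with a few floating-point operations. This is a minimal sketch: the helper names `stirling_log10`, `digits_via_stirling` and `digits_exact` are illustrative assumptions, and the exact count is computed only as a cross-check.

```python
import math


def stirling_log10(n):
    """log10 of Stirling's approximation n^n e^(-n) sqrt(2 pi n)."""
    return n * math.log10(n / math.e) + 0.5 * math.log10(2 * math.pi * n)


def digits_via_stirling(n):
    """Estimated number of decimal digits of n!, from Stirling's formula."""
    return math.floor(stirling_log10(n)) + 1


def digits_exact(n):
    """Exact number of decimal digits of n!, from big-integer arithmetic."""
    return len(str(math.factorial(n)))


for n in (31, 100, 1000):
    print(n, digits_via_stirling(n), digits_exact(n))
```

The estimate needs only logarithms, whereas the exact count requires forming the full integer n!. Since Stirling’s formula slightly underestimates n!, the digit estimate could in principle be off by one when n! lies just above a power of 10.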

We are now left with estimating the growth of the sequence of tangent numbers, 
T_n. The analysis leading to the derivation of the generating function tan(z) has been
so far essentially algebraic or “formal”. Well, we can plot the graph of the tangent
function, for real values of its argument and see that the function becomes infinite at
the points ±π/2, ±3π/2, and so on (Figure 0.2). Such points where a function ceases to be


³ In this book, we shall encounter five different proofs of Stirling’s formula, each of interest for its
own sake: (i) by singularity analysis of the Cayley tree function (p. 407); (ii) by singularity analysis of 
polylogarithms (p. 410); (iii) by the saddle-point method (p. 555); (iv) by Laplace’s method (p. 760); 
(v) by the Mellin transform method applied to the logarithm of the Gamma function (p. 766). 
“smooth” (differentiable) are called singularities. By methods amply developed in this
book, it is the local nature of a generating function at its “dominant” singularities (i.e., 
the ones closest to the origin) that determines the asymptotic growth of the sequence of 
coefficients. From this perspective, the basic fact that tan z has dominant singularities 
at ±π/2 enables us to reason as follows: first approximate the generating function tan z
near its two dominant singularities, namely, 


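\[ \tan(z) \underset{z \to \pm\pi/2}{\sim} \frac{8z}{\pi^2 - 4z^2}; \]

then extract coefficients of this approximation; finally, get in this way a valid approximation
of coefficients:

\[ \frac{T_n}{n!} \underset{n \to \infty}{\sim} 2 \cdot \left(\frac{2}{\pi}\right)^{n+1} \qquad (n \text{ odd}). \]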


With present day technology, we also have available symbolic manipulation systems
(also called “computer algebra” systems) and it is not difficult to verify the
accuracy of our estimates. Here is a small pyramid for n = 3, 5, ..., 21,

                   2 | [1]
                  16 | 1[5]
                 272 | 27[1]
                7936 | 793[5]
              353792 | 35379[1]
            22368256 | 2236825[1]
          1903757312 | 1903757[267]
        209865342976 | 209865342[434]
      29088885112832 | 290888851[04489]
    4951498053124096 | 495149805[2966307]
               (T_n) | (T_n*)

comparing the exact values of T_n against the approximations T_n*, where (n odd)

\[ T_n^{\star} := \left\lfloor 2 \cdot n! \left(\frac{2}{\pi}\right)^{n+1} \right\rfloor, \]

and discrepant digits of the approximation are enclosed in brackets. For n = 21, the error
is only of the order of one in a billion. Asymptotic analysis (p. 269) is in this case
wonderfully accurate.
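
A comparison of this kind does not require a computer algebra system. The sketch below obtains the exact T_n from the differential equation T′(z) = 1 + T(z)², by identifying Taylor coefficients with exact rational arithmetic, and then measures the relative error of the estimate 2 · n! (2/π)^{n+1}. The names `tangent_numbers_from_ode` and `relative_error` are illustrative assumptions; the estimate is evaluated in ordinary floating point, so no integer part is taken.

```python
from fractions import Fraction
from math import factorial, pi


def tangent_numbers_from_ode(max_n):
    """Return [T_0, ..., T_max_n] from T'(z) = 1 + T(z)^2, T(0) = 0."""
    t = [Fraction(0)]  # t[n] = T_n / n!, the Taylor coefficients of T(z)
    for n in range(max_n):
        # Coefficient of z^n in T(z)^2 is a convolution of known coefficients.
        conv = sum(t[k] * t[n - k] for k in range(n + 1))
        # Coefficient of z^n in T'(z) is (n + 1) * t[n + 1].
        t.append(((1 if n == 0 else 0) + conv) / (n + 1))
    # T_n = n! * t[n] is an integer, so its numerator is the value itself.
    return [(t[n] * factorial(n)).numerator for n in range(max_n + 1)]


def relative_error(n, exact):
    """Relative error of the estimate 2 * n! * (2/pi)^(n+1) for odd n."""
    estimate = 2 * factorial(n) * (2 / pi) ** (n + 1)
    return (exact - estimate) / exact


T = tangent_numbers_from_ode(21)
for n in range(3, 22, 2):
    print(n, T[n], relative_error(n, T[n]))
```

The relative error shrinks roughly by a factor of 9 each time n increases by 2, which reflects the contribution of the next singularities of tan z, at ±3π/2.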

In the foregoing discussion, we have played down a fact—one that is important. 
When investigating generating functions from an analytic standpoint, one should gen- 
erally assign complex values to arguments not just real ones. It is singularities in the 
complex plane that matter and complex analysis is needed in drawing conclusions re- 
garding the asymptotic form of coefficients of a generating function. Thus, a large 
portion of this book relies on a complex analysis technology, which starts to be de- 
veloped in Part B dedicated to Complex asymptotics. This approach to combinatorial 
enumeration parallels what happened in the nineteenth century, when Riemann first 
recognized the deep relation between complex analytic properties of the zeta function, 
ζ(s) := Σ 1/n^s, and the distribution of primes, eventually leading to the long-sought
proof of the Prime Number Theorem by Hadamard and de la Vallée-Poussin in 1896. 
Fortunately, relatively elementary complex analysis suffices for our purposes, and we 


Figure 0.3. The collection of binary trees with n = 0, 1,2,3 binary nodes, with 
respective cardinalities 1, 1, 2,5. 


can include in this book a complete treatment of the fragment of the theory needed to 
develop the fundamentals of analytic combinatorics. 


Here is yet another example illustrating the close interplay between combina- 
torics and analysis. When discussing alternating permutations, we have enumerated 
binary trees bearing distinct integer labels that satisfy a constraint—to decrease along 
branches. What about the simpler problem of determining the number of possible 
shapes of binary trees? Let C, be the number of binary trees that have n binary 
branching nodes, hence n + 1 “external nodes”. It is not hard to come up with an 
exhaustive listing for small values of n (Figure 0.3), from which we determine that 


\[ C_0 = 1, \quad C_1 = 1, \quad C_2 = 2, \quad C_3 = 5, \quad C_4 = 14, \quad C_5 = 42. \]


These numbers are probably the most famous ones of combinatorics. They have come 
to be known as the Catalan numbers as a tribute to the Franco-Belgian mathematician
Eugène Charles Catalan (1814–1894), but they already appear in the works of
Euler and Segner in the second half of the eighteenth century (see p. 20). In his reference
treatise Enumerative Combinatorics, Stanley, over 20 pages, lists a collection of
some 66 different types of combinatorial structures that are enumerated by the Catalan 
numbers. 

First, one can write a combinatorial equation, very much in the style of what has 
been done earlier, but without labels: 


\[ \mathcal{C} = \Box \;\cup\; (\mathcal{C}, \bullet, \mathcal{C}). \]


(Here, the □-symbol represents an external node.) With symbolic methods, it is easy
to see that the ordinary generating function of the Catalan numbers, defined as 


\[ C(z) = \sum_{n \ge 0} C_n z^n, \]


satisfies an equation that is a direct reflection of the combinatorial definition, namely, 
\[ C(z) = 1 + z\,C(z)^2. \]
This is a quadratic equation whose solution is 


\[ C(z) = \frac{1 - \sqrt{1 - 4z}}{2z}. \]
Figure 0.4. Left: the real values of the Catalan generating function, which has a
square-root singularity at z = 1/4. Right: the ratio C_n/(4^n n^{-3/2}) plotted together
with its asymptote at 1/√π ≐ 0.56418. The correspondence between singularities
and asymptotic forms of coefficients is the central theme of Part B.


Then, by means of Newton’s theorem relative to the expansion of (1 + x)^α, one finds
easily (x = −4z, α = 1/2) the closed form expression


\[ C_n = \frac{1}{n+1} \binom{2n}{n}. \]

Stirling’s asymptotic formula now comes to the rescue: it implies 


\[ C_n \sim C_n^{\star} \qquad \text{where} \qquad C_n^{\star} := \frac{4^n}{\sqrt{\pi n^3}}. \]


This last approximation is quite usable⁴: it gives C_1* ≐ 2.25 (whereas C_1 = 1), which
is off by a factor of 2, but the error drops to 10% already for n = 10, and it appears to 
be less than 1% for any n > 100. 
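
The three descriptions of the Catalan numbers met so far (the quadratic equation for C(z), the closed form, and the asymptotic estimate) are easily compared numerically. The following sketch assumes nothing beyond the formulas above; the function names are illustrative. The equation C(z) = 1 + z C(z)² is read coefficient by coefficient as C_{n+1} = Σ_k C_k C_{n−k}.

```python
from math import comb, pi, sqrt


def catalan_by_recurrence(max_n):
    """C_0..C_max_n from C(z) = 1 + z C(z)^2, i.e. C_{n+1} = sum_k C_k C_{n-k}."""
    c = [1]
    for n in range(max_n):
        c.append(sum(c[k] * c[n - k] for k in range(n + 1)))
    return c


def catalan_closed_form(n):
    """C_n = binomial(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)


def catalan_asymptotic(n):
    """The estimate 4^n / sqrt(pi n^3), valid for n >= 1."""
    return 4.0 ** n / sqrt(pi * n ** 3)


for n in (1, 10, 100, 200):
    exact = catalan_closed_form(n)
    print(n, exact, catalan_asymptotic(n) / exact)
```

The printed ratios approach 1 from above as n grows, in agreement with the orders of magnitude of the error quoted in the text.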

A plot of the generating function C(z) in Figure 0.4 illustrates the fact that C(z)
has a singularity at z = 1/4 as it ceases to be differentiable (its derivative becomes infinite).
That singularity is quite different from a pole and for natural reasons it is known
as a square-root singularity. As we shall see repeatedly, under suitable conditions 
in the complex plane, a square root singularity for a function at a point ρ invariably
entails an asymptotic form ρ^{-n} n^{-3/2} for its coefficients. More generally, it suffices
to estimate a generating function near a singularity in order to deduce an asymptotic 
approximation of its coefficients. This correspondence is a major theme of the book, 
one that motivates the five central chapters (Chapters IV to VIII).

A consequence of the complex analytic vision of combinatorics is the detection of 
universality phenomena in large random structures. (The term is originally borrowed 
from statistical physics and is nowadays finding increasing use in areas of mathema- 
tics such as probability theory.) By universality is meant here that many quantitative 


⁴ We use a ≐ d to represent a numerical approximation of the real a by the decimal d, with the last
digit of d being at most +1 from its actual value. 
properties of combinatorial structures only depend on a few global features of their
definitions, not on details. For instance a growth in the counting sequence of the form 


\[ K \cdot A^n \, n^{-3/2}, \]


arising from a square-root singularity, will be shown to be universal across all varieties 
of trees determined by a finite set of allowed node degrees—this includes unary— 
binary trees, ternary trees, 0-11-13 trees, as well as many variations such as non-plane 
trees and labelled trees. Even though generating functions may become arbitrarily 
complicated—as in an algebraic function of a very high degree or even the solution to 
an infinite functional equation—it is still possible to extract with relative ease global 
asymptotic laws governing counting sequences. 


RANDOMNESS is another ingredient in our story. How useful is it to determine, 
exactly or approximately, counts that may be so large as to require hundreds if not 
thousands of digits in order to be written down? Take again the example of alter- 
nating permutations. When estimating their number, we have indeed quantified the 
proportion of these among all permutations. In other words, we have been predicting 
the probability that a random permutation of some size n is alternating. Results of 
this sort are of interest in all branches of science. For instance, biologists routinely 
deal with genomic sequences of length 10^5, and the interpretation of data requires developing
enumerative or probabilistic models where the number of possibilities is of
the order of 4^{10^5}. The language of probability theory then proves of great convenience
when discussing characteristic parameters of discrete structures, since we can interpret 
exact or asymptotic enumeration results as saying something concrete about the like- 
lihood of values that such parameters assume. Equally important of course are results 
from several areas of probability theory: as demonstrated in the last chapter of this 
book, such results merge extremely well with the analytic-combinatorial framework. 

Say we are now interested in runs in permutations. These are the longest frag- 
ments of a permutation that already appear in (increasing) sorted order. Here is a 
permutation with 4 runs, separated by vertical bars: 


2 5 8 | 3 9 | 1 4 7 | 6.


Runs naturally present in a permutation are for instance exploited by a sorting algo- 
rithm called “natural list mergesort”, which builds longer and longer runs, starting 
from the original ones and merging them until the permutation is eventually sorted. 
For our understanding of this algorithm, it is then of obvious interest to quantify how 
many runs a permutation is likely to have. 
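
Before turning to generating functions, the notion of run can be explored experimentally. In the sketch below (an illustration under stated assumptions, not a procedure from this book), `count_runs` counts runs as one more than the number of descents, and `mean_runs` averages over all permutations of a small size n by exhaustive enumeration; both names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations


def count_runs(p):
    """Number of maximal increasing fragments of p (one more than its descents)."""
    if not p:
        return 0
    return 1 + sum(1 for a, b in zip(p, p[1:]) if a > b)


def mean_runs(n):
    """Exact average number of runs over all n! permutations of 1..n."""
    perms = list(permutations(range(1, n + 1)))
    return Fraction(sum(count_runs(p) for p in perms), len(perms))


# The permutation displayed above, with its bars removed.
print(count_runs((2, 5, 8, 3, 9, 1, 4, 7, 6)))

for n in range(1, 8):
    print(n, mean_runs(n))
```

Exhaustive averaging becomes impractical beyond n of about 10, which is one reason for seeking the generating function treatment that follows.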

Let P_{n,k} be the number of permutations of size n having k runs. Then, the problem
is once more best approached by generating functions and one finds that the coefficient
of u^k z^n inside the bivariate generating function,

\[ P(z, u) \equiv \frac{1 - u}{1 - u\,e^{z(1-u)}} = 1 + zu + \frac{z^2}{2!}\,u(u+1) + \frac{z^3}{3!}\,u(u^2 + 4u + 1) + \cdots, \]

gives the desired numbers P_{n,k}/n!. (A simple way of establishing the last formula
bases itself on the tree decomposition of permutations and on the symbolic method;
the numbers P_{n,k}, whose importance seems to have been first recognized by Euler,
Figure 0.5. Left: A partial plot of the real values of the Eulerian generating function
z ↦ P(z,u) for z ∈ [0, 1, illustrates the presence of a movable pole for A as u
varies between 0 and 3. Right: A suitable superposition of the histograms of the 
distribution of the number of runs, for n = 2,..., 60, reveals the convergence to a 
Gaussian distribution (p. 695). Part C relates systematically the analysis of such a 
collection of singular behaviours to limit distributions. 


are related to the Eulerian numbers, p. 210.) From here, we can easily determine 
effectively the mean, variance, and even the higher moments of the number of runs 
that a random permutation has: it suffices to expand blindly, or even better with the 
help of a computer, the bivariate generating function above as u > 1: 


$$\frac{1}{1-z} + \frac{1}{2}\,\frac{z(2-z)}{(1-z)^2}\,(u-1) + \frac{1}{12}\,\frac{z^2(6-4z+z^2)}{(1-z)^3}\,(u-1)^2 + \cdots.$$


When u = 1, we just enumerate all permutations: this is the constant term $1/(1-z)$
equal to the exponential generating function of all permutations. The coefficient of
the term $u - 1$ gives the generating function of the mean number of runs, the next one
provides the second moment, and so on. In this way, we discover the expectation and
standard deviation of the number of runs in a permutation of size n:

$$\mu_n = \frac{n+1}{2}, \qquad \sigma_n = \sqrt{\frac{n+1}{12}}.$$
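
The values of the mean, (n + 1)/2, and of the variance, (n + 1)/12, can be checked exactly without expanding the generating function. The sketch below assumes the classical insertion recurrence for the number of permutations with k runs, which is not stated in this section: placing the largest element at the end of one of the k existing runs keeps the number of runs, while placing it in any other gap creates one more run. Function names are illustrative, and a brute-force listing is included as an independent check for small sizes.

```python
from fractions import Fraction
from itertools import permutations


def run_distribution(n):
    """row[k] = number of permutations of size n >= 1 having exactly k runs."""
    row = [0, 1]  # size 1: one permutation, one run
    for m in range(2, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            keep = k * row[k] if k < m else 0   # m placed at the end of one of k runs
            grow = (m - k + 1) * row[k - 1]     # m placed elsewhere: one more run
            new[k] = keep + grow
        row = new
    return row


def brute_force_distribution(n):
    """Same table obtained by listing all n! permutations (small n only)."""
    row = [0] * (n + 1)
    for p in permutations(range(1, n + 1)):
        descents = sum(1 for a, b in zip(p, p[1:]) if a > b)
        row[descents + 1] += 1
    return row


def mean_and_variance(n):
    """Exact mean and variance of the number of runs over permutations of size n."""
    row = run_distribution(n)
    total = sum(row)
    mean = Fraction(sum(k * c for k, c in enumerate(row)), total)
    second = Fraction(sum(k * k * c for k, c in enumerate(row)), total)
    return mean, second - mean ** 2


print(run_distribution(3))
print(mean_and_variance(30))
```

Working with `Fraction` keeps the moments exact, so they can be compared with the closed forms rather than with floating-point approximations.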

Then, by easy analytic—probabilistic inequalities (Chebyshev inequalities) that other- 
wise form the basis of what is known as the second moment method, we learn that the 
distribution of the number of runs is concentrated around its mean: in all likelihood, 
if one takes a random permutation, the number of its runs is going to be very close to 
its mean. The effects of such quantitative laws are quite tangible. It suffices to draw a 
sample of one element for n = 30 to get, for instance: 


13, 22, 29 | 12, 15, 23 | 8, 28 | 18 | 6, 26 | 4, 10, 16 | 1, 5, 27 | 3, 14, 17, 20 | 2, 21, 30 | 25 | 11, 19 | 9 | 7, 24.


For n = 30, the mean is 15½, and this sample comes rather close as it has 13 runs.
We shall furthermore see in Chapter IX that even for moderately large permutations
of size 10 000 and beyond, the probability for the number of observed runs to deviate
by more than 10% from the mean is less than $10^{-65}$. As witnessed by this example,
much regularity accompanies properties of large combinatorial structures.

Figure 0.6. Left: The bivariate generating function z ↦ C(z, u) enumerating binary
trees by size and number of leaves exhibits consistently a square-root singularity, for
several values of u. Right: a binary tree of size 300 drawn uniformly at random has
69 leaves. As shown in Part C, singularity perturbation properties are at the origin of
many randomness properties of combinatorial structures.

More refined methods combine the observation of singularities with analytic results
from probability theory (e.g., continuity theorems for characteristic functions). In
the case of runs in permutations, the quantity P(z, u) viewed as a function of z when u
is fixed appears to have a pole: this fact is suggested by Figure 0.5 [left]. Then we are
confronted with a fairly regular deformation of the generating function of all permutations.
A parameterized version (with parameter u) of singularity analysis then gives
access to a description of the asymptotic behaviour of the Eulerian numbers $P_{n,k}$. This
enables us to describe very precisely what goes on: in a random permutation of large
size n, once it has been centred by its mean and scaled by its standard deviation, the
distribution of the number of runs is asymptotically Gaussian; see Figure 0.5 [right].

[Diagram with boxes labelled: Combinatorial structures; Exact; Singularity analysis;
Saddle-point method; Multivariate asymptotics and limit laws; Asymptotic counting,
moments of parameters; Limit laws, large deviations.]

Figure 0.7. The logical structure of Analytic Combinatorics.

A somewhat similar type of situation prevails for binary trees. Say we are interested
in leaves (also sometimes figuratively known as “cherries”) in trees: these are binary
nodes that are attached to two external nodes (□). Let $C_{n,k}$ be the number of trees
of size n having k leaves. The bivariate generating function $C(z,u) := \sum_{n,k} C_{n,k} z^n u^k$
encodes all the information relative to leaf statistics in random binary trees. A modification
of previously seen symbolic arguments shows that C(z, u) still satisfies a
quadratic equation resulting in the explicit form,

$$C(z,u) = \frac{1 - \sqrt{1 - 4z + 4z^2(1-u)}}{2z}.$$

This reduces to C(z) for u = 1, as it should, and the bivariate generating function
C(z, u) is a deformation of C(z) as u varies. In fact, the network of curves of
Figure 0.6 for several fixed values of u illustrates the presence of a smoothly varying
square-root singularity (the aspect of each curve is similar to that of Figure 0.4). It is
possible to analyse the perturbation induced by varying values of u, to the effect that
C(z, u) is of the global analytic type

$$\sqrt{1 - \frac{z}{\rho(u)}},$$

for some analytic ρ(u). The already evoked process of singularity analysis then shows
that the probability generating function of the number of leaves in a tree of size n is of
the rough form

$$\left(\frac{\rho(1)}{\rho(u)}\right)^{n} \bigl(1 + o(1)\bigr).$$

This is known as a “quasi-powers” approximation. It resembles very much the 
probability generating function of a sum of n independent random variables, a sit- 
uation that gives rise to the classical Central Limit Theorem of probability theory. 
Accordingly, one gets that the limit distribution of the number of leaves in a large 
random binary tree is Gaussian. In abstract terms, the deformation induced by the 
secondary parameter (here, the number of leaves, previously, the number of runs) is 
susceptible to a perturbation analysis, to the effect that a singularity gets smoothly 
displaced without changing its nature (here, a square root singularity, earlier a pole) 
and a limit law systematically results. Again some of the conclusions can be verified 
even by very small samples: the single tree of size 300 drawn at random and dis- 
played in Figure 0.6 (right) has 69 leaves, whereas the expected value of this number 
is approximately 75.375 and the standard deviation is a little over 4. In a large number of cases of
which this one is typical, we find metric laws of combinatorial structures that govern 
large structures with high probability and eventually make them highly predictable. 
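
The figures quoted for leaves can be reproduced on a small scale. The sketch below assumes the quadratic relation C = 1 + z(C² − 1 + u), which is one way to read the explicit form of C(z, u) (the relation itself is not spelled out in this section), and tabulates trees by size and number of leaves by direct convolution. The closed form used for the mean, the coefficient of z^n in z/√(1 − 4z) divided by the Catalan number, is likewise derived here from that relation rather than taken from the text, so it is checked against the table in the tests.

```python
from fractions import Fraction
from math import comb


def leaf_table(max_size):
    """table[n][k] = number of binary trees with n internal nodes and k leaves."""
    # Size 0: the external node alone. Size 1: a single node, which is a leaf.
    table = [[1], [0, 1]]
    for n in range(2, max_size + 1):
        poly = [0] * (n + 1)
        for i in range(n):  # left subtree of size i, right subtree of size n-1-i
            for a, ca in enumerate(table[i]):
                for b, cb in enumerate(table[n - 1 - i]):
                    poly[a + b] += ca * cb
        table.append(poly)
    return table[: max_size + 1]


def catalan(n):
    """Number of binary trees of size n."""
    return comb(2 * n, n) // (n + 1)


def mean_leaves(n):
    """Expected number of leaves in a uniform random binary tree of size n >= 1."""
    return Fraction(comb(2 * n - 2, n - 1), catalan(n))


print(leaf_table(4)[4])
print(float(mean_leaves(300)))
```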

Such randomness properties form the subject of Part C of this book dedicated to 
random structures. As our earlier description implies, there is an extreme degree of
generality in this analytic approach to combinatorial parameters, and after reading this
book, the reader will be able to recognize by herself dozens of such cases at sight, and 
effortlessly establish the corresponding theorems. 


A RATHER ABSTRACT VIEW of combinatorics emerges from the previous discus- 
sion; see Figure 0.7. A combinatorial class, as regards its enumerative properties, can 
be viewed as a surface in four-dimensional real space: this is the graph of its generating
function, considered as a function from the set $\mathbb{C} \cong \mathbb{R}^2$ of complex numbers to
itself, and is otherwise known as a Riemann surface. This surface has “cracks”, that is,
singularities, which determine the asymptotic behaviour of the counting sequence. A 
combinatorial construction (such as those freely forming sequences, sets, and so on) 
can then be examined through the effect it has on singularities. In this way, seemingly 
different types of combinatorial structures appear to be subject to common laws gov- 
erning not only counting but also finer characteristics of combinatorial structures. For 
the already discussed case of universality in tree enumerations, additional universal 
laws valid across many tree varieties constrain for instance height (which, with high 
probability, is proportional to the square root of size) and the number of leaves (which 
is invariably normal in the asymptotic limit). 

What happens regarding probabilistic properties of combinatorial parameters is 
this. A parameter of a combinatorial class is fully determined by a bivariate generating 
function, which is a deformation of the basic counting generating function of the class 
(in the sense that setting the secondary variable u to 1 erases the information relative
to the parameter and leads back to the univariate counting generating function). Then, 
the asymptotic distribution of a parameter of interest is characterized by a collection 
of surfaces, each having its own singularities. The way the singularities’ locations 
move or their nature changes under deformation encodes all the necessary informa- 
tion regarding the distribution of the parameter under consideration. Limit laws for 
combinatorial parameters can then be obtained and the corresponding phenomena can 
be organized into broad categories, called schemas. It would be inconceivable to attain 
such a far-reaching classification of metric properties of combinatorial structures by 
elementary real analysis alone. 

Objects on which we are going to inflict the treatments just described include 
many of the most important ones of discrete mathematics, as well as the ones that sur- 
face recurrently in several branches of the applied sciences. We shall thus encounter 
words and sequences, trees and lattice paths, graphs of various sorts, mappings, al- 
locations, permutations, integer partitions and compositions, polyominoes and pla- 
nar maps, to name but a few. In most cases, their principal characteristics will be 
finely quantified by the methods of analytic combinatorics. This book indeed devel- 
ops a coherent theory of random combinatorial structures based on a powerful analytic 
methodology. Literally dozens of quite diverse combinatorial types can then be treated 
by a logically transparent chain. You will not find ready-made answers to all questions 
in this book, but, hopefully, methods that can be successfully used to address a great 
many of them. 


Bienvenue! Welcome! 


Part A 


SYMBOLIC METHODS 


Combinatorial Structures and 
Ordinary Generating Functions 


Laplace discovered the remarkable correspondence between 
set theoretic operations and operations on formal power series 
and put it to great use to solve a variety of combinatorial problems.


— GIAN-CARLO Rota [518]

This chapter and the next are devoted to enumeration, where the problem is to determine
the number of combinatorial configurations described by finite rules, and do so
for all possible sizes. For instance, how many different words are there of length 17? 
Of length n, for general n? These questions are easy, but what if some constraints 
are imposed, e.g., no four identical elements in a row? The solutions are exactly 
encoded by generating functions, and, as we shall see, generating functions are the 
central mathematical object of combinatorial analysis. We examine here a framework 
that, contrary to traditional treatments based on recurrences, explains the surprising 
efficiency of generating functions in the solution of combinatorial enumeration prob- 
lems. 

This chapter serves to introduce the symbolic approach to combinatorial enumer- 
ations. The principle is that many general set-theoretic constructions admit a direct 
translation as operations over generating functions. This principle is made concrete by 
means of a dictionary that includes a collection of core constructions, namely the op- 
erations of union, cartesian product, sequence, set, multiset, and cycle. Supplementary 
operations such as pointing and substitution can also be similarly translated. In this 
way, a language describing elementary combinatorial classes is defined. The problem 
of enumerating a class of combinatorial structures then simply reduces to finding a 
proper specification, a sort of computer program for the class expressed in terms of 
the basic constructions. The translation into generating functions becomes, after this, 
a purely mechanical symbolic process. 

We show here how to describe in such a context integer partitions and compositions,
as well as many word and tree enumeration problems, by means of ordinary
generating functions. A parallel approach, developed in Chapter II, applies to labelled
objects; in contrast, the plain structures considered in this chapter are called unlabelled.
The methodology is susceptible to multivariate extensions with which many
characteristic parameters of combinatorial objects can also be analysed in a unified
manner: this is to be examined in Chapter III. The symbolic method also has the great
merit of connecting nicely with complex asymptotic methods that exploit analyticity
properties and singularities, to the effect that precise asymptotic estimates are usually
available whenever the symbolic method applies. A systematic treatment of these aspects
forms the basis of Part B of this book, Complex asymptotics (Chapters IV–VIII).


I.1. Symbolic enumeration methods


First and foremost, combinatorics deals with discrete objects, that is, objects that 
can be finitely described by construction rules. Examples are words, trees, graphs, 
permutations, allocations, functions from a finite set into itself, topological configu- 
rations, and so on. A major question is to enumerate such objects according to some 
characteristic parameter(s). 


Definition I.1. A combinatorial class, or simply a class, is a finite or denumerable set 
on which a size function is defined, satisfying the following conditions: 


(i) the size of an element is a non-negative integer;
(ii) the number of elements of any given size is finite.


If $\mathcal{A}$ is a class, the size of an element $\alpha \in \mathcal{A}$ is denoted by $|\alpha|$, or $|\alpha|_{\mathcal{A}}$ in the few cases
where the underlying class needs to be made explicit. Given a class $\mathcal{A}$, we consistently
denote by $\mathcal{A}_n$ the set of objects in $\mathcal{A}$ that have size n and use the same group of letters
for the counts $A_n = \mathrm{card}(\mathcal{A}_n)$ (alternatively, also $a_n = \mathrm{card}(\mathcal{A}_n)$). An axiomatic
presentation is then as follows: a combinatorial class is a pair $(\mathcal{A}, |\cdot|)$ where $\mathcal{A}$ is at
most denumerable and the mapping $|\cdot| \in (\mathcal{A} \to \mathbb{Z}_{\ge 0})$ is such that the inverse image
of any integer is finite.


Definition I.2. The counting sequence of a combinatorial class is the sequence of
integers $(A_n)_{n \ge 0}$ where $A_n = \mathrm{card}(\mathcal{A}_n)$ is the number of objects in class $\mathcal{A}$ that have
size n.
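
To make Definitions I.1 and I.2 concrete, a class can be modelled as a collection of objects together with a size function, and its counting sequence obtained by tallying sizes. The sketch below is only an illustration on a finite truncation (binary words of length at most 6, with size taken as word length); the names `counting_sequence` and `binary_words` are hypothetical.

```python
from collections import Counter
from itertools import product


def counting_sequence(objects, size, max_size):
    """Return [A_0, ..., A_max_size] for the supplied objects under the given size function."""
    tally = Counter(size(obj) for obj in objects)
    return [tally.get(n, 0) for n in range(max_size + 1)]


def binary_words(max_length):
    """All words over {0, 1} of length 0..max_length, as strings."""
    for length in range(max_length + 1):
        for letters in product("01", repeat=length):
            yield "".join(letters)


print(counting_sequence(binary_words(6), len, 6))

# The same tally works under a constraint, e.g. no four identical letters in a row.
constrained = (w for w in binary_words(6) if "0000" not in w and "1111" not in w)
print(counting_sequence(constrained, len, 6))
```

The finiteness condition (ii) of Definition I.1 is what guarantees that each entry of the tally is a well-defined integer once all objects of a given size have been seen.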


Example I.1. Binary words. Consider first the set $\mathcal{W}$ of binary words, which are sequences of
elements taken from the binary alphabet $\mathcal{A} = \{0, 1\}$,

$$\mathcal{W} := \{\varepsilon, 0, 1, 00, 01, 10, 11, 000, 001, 010, \ldots, 1001101, \ldots\},$$

with ε the empty word. Define size to be the number of letters that a word comprises. There are
two possibilities for each letter and possibilities multiply, so that the counting sequence $(W_n)$
satisfies

$$W_n = 2^n.$$

(This sequence has a well-known legend associated with the invention of the game of chess: the
inventor was promised by his king one grain of rice for the first square of the chessboard, two
for the second, four for the third, and so on. The king naturally could not deliver the promised
$2^{64} - 1$ grains!)

Figure I.1. The collection $\mathcal{T}$ of all triangulations of regular polygons (with size defined
as the number of triangles) is a combinatorial class, whose counting sequence
starts as $T_0 = 1$, $T_1 = 1$, $T_2 = 2$, $T_3 = 5$, $T_4 = 14$, $T_5 = 42$.


Example I.2. Permutations. A permutation of size n is by definition a bijective mapping of the
integer interval¹ $\mathcal{I}_n := [1\,..\,n]$. It is thus representable by an array,

$$\begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma_1 & \sigma_2 & \cdots & \sigma_n \end{pmatrix},$$

or equivalently by the sequence $\sigma_1 \sigma_2 \cdots \sigma_n$ of its distinct elements. The set $\mathcal{P}$ of permutations
is

$$\mathcal{P} = \{\ldots, 12, 21, 123, 132, 213, 231, 312, 321, 1234, \ldots, 532614, \ldots\}.$$

For a permutation written as a sequence of n distinct numbers, there are n places where one can
accommodate n, then n − 1 remaining places for n − 1, and so on. Therefore, the number $P_n$
of permutations of size n satisfies

$$P_n = n! = 1 \cdot 2 \cdots n.$$

As indicated in our Invitation chapter (p. 2), this formula has been known for at least fifteen
centuries.


Example I.3. Triangulations. The class $\mathcal{T}$ of triangulations comprises triangulations of convex
polygonal domains which are decompositions into non-overlapping triangles (taken up to
smooth deformations of the plane). We define the size of a triangulation to be the number of
triangles it is composed of. For instance, a convex quadrilateral ABCD can be decomposed into
two triangles in two ways (by means of either the diagonal AC or the diagonal BD); similarly,
there are five different ways to dissect a convex pentagon into three triangles: see Figure I.1.
Agreeing that $T_0 = 1$, we then find

$$T_0 = 1, \quad T_1 = 1, \quad T_2 = 2, \quad T_3 = 5, \quad T_4 = 14, \quad T_5 = 42.$$

It is a non-trivial combinatorial result due to Euler and Segner [146, 196, 197] around 1750 that
the number $T_n$ of triangulations is

(1)   $$T_n = \frac{1}{n+1}\binom{2n}{n} = \frac{(2n)!}{(n+1)!\,n!},$$

a central quantity of combinatorial analysis known as a Catalan number: see our Invitation,
p. 7, the historical synopsis on p. 20, the discussion on p. 35, and Subsection I.5.3, p. 73.


¹ We borrow from computer science the convenient practice of denoting an integer interval by 1 .. n or
[1 .. n], whereas [0, n] represents a real interval.




Following Euler [196], the counting of triangulations is best approached by generating
functions: see again Figure I.2, p. 20 for historical context.


Although the previous three examples are simple enough, it is generally a good 
idea, when confronted with a combinatorial enumeration problem, to determine the 
initial values of counting sequences, either by hand or better with the help of a com- 
puter, somehow. Here, we find: 


(2)   n     0  1  2  3   4    5    6     7      8       9       10
      W_n   1  2  4  8  16   32   64   128    256     512     1024
      P_n   1  1  2  6  24  120  720  5040  40320  362880  3628800
      T_n   1  1  2  5  14   42  132   429   1430    4862    16796
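
A few lines of code suffice to regenerate table (2). This is a minimal sketch using the closed forms W_n = 2^n and P_n = n! together with formula (1) for T_n; the helper name `initial_values` is illustrative.

```python
from math import comb, factorial


def initial_values(max_n):
    """Initial terms of the counting sequences of words, permutations and triangulations."""
    rows = {"W": [], "P": [], "T": []}
    for n in range(max_n + 1):
        rows["W"].append(2 ** n)                      # binary words of length n
        rows["P"].append(factorial(n))                # permutations of size n
        rows["T"].append(comb(2 * n, n) // (n + 1))   # triangulations, formula (1)
    return rows


for name, values in initial_values(10).items():
    print(name, *values)
```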


Such an experimental approach may greatly help identify sequences. For instance,
had we not known the formula (1) for triangulations, observing unusual factorizations
such as

$$T_{40} = 2^2 \cdot 5 \cdot 7^2 \cdot 11 \cdot 23 \cdot 43 \cdot 47 \cdot 53 \cdot 59 \cdot 61 \cdot 67 \cdot 71 \cdot 73 \cdot 79,$$

which contains all prime numbers from 43 to 79 and no prime larger than 80, would
quickly put us on the track of the right formula. There even exists nowadays a huge
On-line Encyclopedia of Integer Sequences (EIS) due to Sloane that is available in
electronic form [543] (see also an earlier book by Sloane and Plouffe [544]) and contains
more than 100 000 sequences. Indeed, the three sequences $(W_n)$, $(P_n)$, and $(T_n)$
are respectively identified² as EIS A000079, EIS A000142, and EIS A000108.
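
The factorization of T_40 is easy to confirm experimentally. The sketch below uses plain trial division, which is adequate here only because all prime factors turn out to be small; the function name `factorize` is illustrative.

```python
from math import comb


def factorize(n):
    """Prime factorization by trial division, returned as {prime: exponent}."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1  # what remains is a single prime
    return factors


t40 = comb(80, 40) // 41  # formula (1) with n = 40
print(factorize(t40))
```

Seeing only primes below 2n = 80, with every prime between n and 2n present except n + 1 = 41, is the kind of pattern that hints at a quotient of factorials.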


▷ I.1. Necklaces. How many different types of necklace designs can you form with n beads,
each having one of two colours, ◦ and •, where it is postulated that orientation matters? Here
are the possibilities for n = 1, 2, 3,

[Picture: the necklace designs for n = 1, 2, 3.]

This is equivalent to enumerating circular arrangements of two letters and an exhaustive listing
program can be based on the smallest lexicographical representation of each word, as suggested
by (20), p. 26. The counting sequence starts as 2, 3, 4, 6, 8, 14, 20, 36, 60, 108, 188, 352 and
constitutes EIS A000031. [An explicit formula appears later in this chapter (p. 64).] What if
two necklace designs that are mirror images of one another are identified? ◁
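
The exhaustive listing suggested in Note I.1 can be sketched as follows: each circular arrangement is represented by the lexicographically smallest of its rotations, and distinct representatives are counted. This brute-force version is only one guess at how such a program might look; it enumerates all 2^n words and is meant for checking the first few terms. The function names are hypothetical.

```python
from itertools import product


def canonical_rotation(word):
    """Lexicographically smallest rotation of a tuple."""
    return min(word[i:] + word[:i] for i in range(len(word)))


def necklace_count(n, colours=2):
    """Circular arrangements of n beads, rotations identified, orientation kept."""
    return len({canonical_rotation(w) for w in product(range(colours), repeat=n)})


def bracelet_count(n, colours=2):
    """Same count when mirror images are identified as well."""
    def canon(w):
        return min(canonical_rotation(w), canonical_rotation(w[::-1]))
    return len({canon(w) for w in product(range(colours), repeat=n)})


print([necklace_count(n) for n in range(1, 9)])
```

The second function addresses the closing question of the note experimentally; the two counts first differ at n = 6.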


▷ I.2. Unimodal permutations. Such a permutation has exactly one local maximum. In other
words it is of the form $\sigma_1 \cdots \sigma_n$ with $\sigma_1 < \sigma_2 < \cdots < \sigma_k = n$ and $\sigma_k = n > \sigma_{k+1} > \cdots > \sigma_n$,
for some k ≥ 1. How many such permutations are there of size n? For n = 5, the number is 16:
the permutations are 12345, 12354, 12453, 12543, 13452, 13542, 14532 and 15432 and their
reversals. [Due to Jon Perry, see EIS A000079.] ◁
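
The count in Note I.2 can be explored by brute force before looking for a formula. The sketch below simply tests every permutation against the definition; the names `is_unimodal` and `count_unimodal` are hypothetical, and the approach is only practical for small n.

```python
from itertools import permutations


def is_unimodal(perm):
    """True if perm increases up to its maximum and decreases afterwards."""
    k = perm.index(max(perm))
    rising = all(a < b for a, b in zip(perm[:k + 1], perm[1:k + 1]))
    falling = all(a > b for a, b in zip(perm[k:], perm[k + 1:]))
    return rising and falling


def count_unimodal(n):
    """Number of unimodal permutations of size n, by exhaustive listing."""
    return sum(1 for p in permutations(range(1, n + 1)) if is_unimodal(p))


print([count_unimodal(n) for n in range(1, 8)])
```

The values obtained are consistent with the reference to EIS A000079 (the powers of 2): each element other than n goes either to the left or to the right of the maximum.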

It is also of interest to note that words and permutations may be enumerated using
the most elementary counting principles, namely, for finite sets $\mathcal{B}$ and $\mathcal{C}$

(3)   $$\begin{aligned} \mathrm{card}(\mathcal{B} \cup \mathcal{C}) &= \mathrm{card}(\mathcal{B}) + \mathrm{card}(\mathcal{C}) \quad (\text{provided } \mathcal{B} \cap \mathcal{C} = \emptyset) \\ \mathrm{card}(\mathcal{B} \times \mathcal{C}) &= \mathrm{card}(\mathcal{B}) \cdot \mathrm{card}(\mathcal{C}). \end{aligned}$$


² Throughout this book, a reference such as EIS Axxx points to Sloane’s Encyclopedia of Integer
Sequences [543]. The database contains more than 100 000 entries.

We shall see soon that these principles, which lie at the basis of our very concept of
number, admit a powerful generalization (Equation (19), p. 23, below).

Next, for combinatorial enumeration purposes, it proves convenient to identify 
combinatorial classes that are merely variants of one another. 


Definition I.3. Two combinatorial classes $\mathcal{A}$ and $\mathcal{B}$ are said to be (combinatorially)
isomorphic, which is written $\mathcal{A} \cong \mathcal{B}$, iff their counting sequences are identical. This
condition is equivalent to the existence of a bijection from $\mathcal{A}$ to $\mathcal{B}$ that preserves size,
and one also says that $\mathcal{A}$ and $\mathcal{B}$ are bijectively equivalent.


We normally identify isomorphic classes and accordingly employ a plain equality
sign ($\mathcal{A} = \mathcal{B}$). We then confine the notation $\mathcal{A} \cong \mathcal{B}$ to stress cases where combinatorial
isomorphism results from some non-trivial transformation.


Definition I.4. The ordinary generating function (OGF) of a sequence $(A_n)$ is the
formal power series

(7)   $$A(z) = \sum_{n=0}^{\infty} A_n z^n.$$

The ordinary generating function (OGF) of a combinatorial class $\mathcal{A}$ is the generating
function of the numbers $A_n = \mathrm{card}(\mathcal{A}_n)$. Equivalently, the OGF of class $\mathcal{A}$ admits the
combinatorial form

(8)   $$A(z) = \sum_{\alpha \in \mathcal{A}} z^{|\alpha|}.$$

It is also said that the variable z marks size in the generating function.


The combinatorial form of an OGF in (8) results straightforwardly from observing 
that the term z” occurs as many times as there are objects in A having size n. We stress 
the fact that, at this stage and throughout Part A, generating functions are manipulated 
algebraically as formal sums; that is, they are considered as formal power series (see 
the framework of Appendix A.5: Formal power series, p. 730).


Naming convention. We adhere to a systematic naming convention: classes, their
counting sequences, and their generating functions are systematically denoted by the
same groups of letters: for instance, $\mathcal{A}$ for a class, $\{A_n\}$ (or $\{a_n\}$) for the counting
sequence, and $A(z)$ (or $a(z)$) for its OGF.

Coefficient extraction. We let generally $[z^n] f(z)$ denote the operation of extracting
the coefficient of $z^n$ in the formal power series $f(z) = \sum f_n z^n$, so that

(9)   $$[z^n] \Bigl( \sum_{n \ge 0} f_n z^n \Bigr) = f_n.$$


(The coefficient extractor $[z^n] f(z)$ reads as “coefficient of $z^n$ in $f(z)$”.)


1. On September 4, 1751, Euler writes to his friend Goldbach [196]:

   Original: Ich bin neulich auf eine Betrachtung gefallen, welche mir nicht wenig
   merkwürdig vorkam. Dieselbe betrifft, auf wie vielerley Arten ein gegebenes
   polygonum durch Diagonallinien in triangula zerschnitten werden könne.

   Translation: I have recently encountered a question, which appears to me rather
   noteworthy. It concerns the number of ways in which a given [convex] polygon can
   be decomposed into triangles by diagonal lines.

Euler then describes the problem (for an n-gon, i.e., (n − 2) triangles) and concludes:

   Original: Setze ich nun die Anzahl dieser verschiedenen Arten = x [...]. Hieraus
   habe ich nun den Schluss gemacht, dass generaliter sey
   $$x = \frac{2.6.10.14\ldots(4n-10)}{2.3.4.5\ldots(n-1)}$$
   [...] Ueber die Progression der Zahlen 1, 2, 5, 14, 42, 132, etc. habe ich auch diese
   Eigenschaft angemerket, dass
   $$1 + 2a + 5a^2 + 14a^3 + 42a^4 + 132a^5 + \text{etc.} = \frac{1 - 2a - \sqrt{1-4a}}{2aa}.$$

   Translation: Let me now denote by x this number of ways [...]. I have then reached
   the conclusion that in all generality
   $$x = \frac{2.6.10.14\ldots(4n-10)}{2.3.4.5\ldots(n-1)}$$
   [...] Regarding the progression of the numbers 1, 2, 5, 14, 42, 132, and so on, I have
   also observed the following property:
   $$1 + 2a + 5a^2 + 14a^3 + 42a^4 + 132a^5 + \text{etc.} = \frac{1 - 2a - \sqrt{1-4a}}{2aa}.$$

Thus, as early as 1751, Euler knew the solution as well as the associated generating function.
From his writing, it is however unclear whether he had found complete proofs.


2. In the course of the 1750s, Euler communicated the problem, together with initial elements
of the counting sequence, to Segner, who writes in his publication [146] dated 1758: “The
great Euler has benevolently communicated these numbers to me; the way in which he found
them, and the law of their progression having remained hidden to me” [“quos numeros mecum
beneuolus communicauit summus Eulerus; modo, quo eos reperit, atque progressionis ordine,
celatis”]. Segner develops a recurrence approach to Catalan numbers. By a root decomposition
analogous to ours, on p. 35, he proves (in our notation, for decompositions into n triangles)

(4)   $$T_n = \sum_{k=0}^{n-1} T_k T_{n-1-k}, \qquad T_0 = 1,$$

a recurrence by which the Catalan numbers can be computed to any desired order. (Segner’s
work was to be reviewed in [197], anonymously, but most probably, by Euler.)


3. During the 1830s, Liouville circulated the problem and wrote to Lamé, who answered the
next day(!) with a proof [399] based on recurrences similar to (4) of the explicit expression:

(5)   $$T_n = \frac{1}{n+1}\binom{2n}{n}.$$

Interestingly enough, Lamé’s three-page note [399] appeared in the 1838 issue of the Journal
de mathématiques pures et appliquées (“Journal de Liouville”), immediately followed by
a longer study by Catalan [106], who also observed that the $T_n$ intervene in the number of
ways of multiplying n numbers (this book, §I.5.3, p. 73). Catalan would then return to these
problems [107, 108], and the numbers 1, 1, 2, 5, 14, 42, ... eventually became known as the
Catalan numbers. In [107], Catalan finally proves the validity of Euler’s generating function:

(6)   $$T(z) := \sum_{n} T_n z^n = \frac{1 - \sqrt{1-4z}}{2z}.$$


4. Nowadays, symbolic methods directly yield the generating function (6), from which both the
recurrence (4) and the explicit form (5) follow easily; see pp. 6 and 35.


Figure I.2. The prehistory of Catalan numbers.


[Structural diagram of the molecule, reduced to the formula $C_{10}H_{14}N_2 \cong z^{26}$.]

Figure I.3. A molecule, methylpyrrolidinyl-pyridine (nicotine), is a complex assembly
whose description can be reduced to a single formula corresponding here to a
total of 26 atoms.


The OGFs corresponding to our three examples $\mathcal{W}$, $\mathcal{P}$, $\mathcal{T}$ are then

(10)   $$\begin{aligned} W(z) &= \sum_{n=0}^{\infty} 2^n z^n = \frac{1}{1-2z} \\ P(z) &= \sum_{n=0}^{\infty} n!\, z^n \\ T(z) &= \sum_{n=0}^{\infty} \frac{1}{n+1}\binom{2n}{n} z^n = \frac{1-\sqrt{1-4z}}{2z}. \end{aligned}$$

The first expression relative to W(z) is immediate as it is the sum of a geometric
progression. The second generating function P(z) is not clearly related to simple
functions of analysis. (Note that the expression still makes sense within the strict
framework of formal power series.) The third expression relative to T(z) is equivalent
to the explicit form of $T_n$ via Newton’s expansion of $(1 + x)^{1/2}$ (pp. 7 and 35 as well
as Figure I.2). The OGFs W(z) and T(z) can then be interpreted as standard analytic
objects, upon assigning values in the complex domain $\mathbb{C}$ to the formal variable z.
In effect, the series W(z) and T(z) converge in a neighbourhood of 0 and represent
complex functions that are well defined near the origin, namely when |z| < 1/2 for W(z)
and |z| < 1/4 for T(z). The OGF P(z) is a purely formal power series (its radius of
convergence is 0) that can nonetheless be subjected to the usual algebraic operations
of power series. (Permutation enumeration is most conveniently approached by the
exponential generating functions developed in Chapter II.)
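
The equivalence between the closed form of T(z) and formula (1) can be checked coefficient by coefficient. The sketch below expands (1 − 4z)^{1/2} by Newton's binomial series in exact rational arithmetic and reads off the coefficients of (1 − √(1 − 4z))/(2z); the helper names `binomial_half` and `catalan_from_sqrt` are illustrative.

```python
from fractions import Fraction
from math import comb


def binomial_half(m):
    """Generalized binomial coefficient binom(1/2, m) = (1/2)(1/2 - 1)...(1/2 - m + 1) / m!."""
    result = Fraction(1)
    for j in range(m):
        result *= (Fraction(1, 2) - j) / (j + 1)
    return result


def catalan_from_sqrt(n):
    """Coefficient of z^n in (1 - sqrt(1 - 4z)) / (2z)."""
    # Coefficient of z^(n+1) in sqrt(1 - 4z) = (1 + x)^(1/2) with x = -4z.
    sqrt_coeff = binomial_half(n + 1) * (-4) ** (n + 1)
    # Subtracting from 1 flips the sign; dividing by 2z shifts the index and halves.
    return -sqrt_coeff / 2


print([int(catalan_from_sqrt(n)) for n in range(8)])
```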


Combinatorial form of generating functions (GFs). The combinatorial form (8)
shows that generating functions are nothing but a reduced representation of the combinatorial
class, where internal structures are destroyed and elements contributing to
size (atoms) are replaced by the variable z. In a sense, this is analogous to what
chemists do by writing linear reduced (“molecular”) formulae for complex molecules
(Figure I.3). Great use of this observation was made by Schützenberger as early as the
1950s and 1960s. It explains the many formal similarities that are observed between
combinatorial structures and generating functions.

[Picture: a finite family $\mathcal{H}$ of graphs; each graph is replaced by a monomial in z, and the
monomials are gathered into $H(z) = z + z^2 + 2z^3 + 3z^4$.]

Figure I.4. A finite family of graphs and its eventual reduction to a generating function.


Figure I.4 provides a combinatorial illustration: start with a (finite) family of
graphs $\mathcal{H}$, with size taken as the number of vertices. Each vertex in each graph is
replaced by the variable z and the graph structure is “forgotten”; then the monomials
corresponding to each graph are formed and the generating function is finally obtained
by gathering all the monomials.

For instance, there are 3 graphs of size 4 in $\mathcal{H}$, in agreement with the fact that
$[z^4] H(z) = 3$. If size had been instead defined by number of edges, another generating
function would have resulted, namely, with y marking the new size: $1 + y + y^2 + 2y^3 + y^4 + y^6$.
If both number of vertices and number of edges are of interest, then a bivariate
generating function is obtained: $H(z, y) = z + z^2 y + z^3 y^2 + z^3 y^3 + z^4 y^3 + z^4 y^4 + z^4 y^6$;
such multivariate generating functions are developed systematically in Chapter III.
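
The "gathering of monomials" can be mimicked directly. In the sketch below, each graph of the family is recorded only by its number of vertices and number of edges; this list of pairs is inferred from the monomials of the bivariate generating function H(z, y) rather than from the picture itself, and the helper name `gather` is hypothetical.

```python
from collections import Counter

# (vertices, edges) of each graph in the family, one pair per monomial of H(z, y).
GRAPHS = [(1, 0), (2, 1), (3, 2), (3, 3), (4, 3), (4, 4), (4, 6)]


def gather(graphs, index):
    """Map each exponent to its coefficient when size is the chosen component."""
    return dict(sorted(Counter(g[index] for g in graphs).items()))


by_vertices = gather(GRAPHS, 0)   # coefficients of H(z)
by_edges = gather(GRAPHS, 1)      # coefficients of the series in y
bivariate = Counter(GRAPHS)       # coefficients of H(z, y)

print(by_vertices)
print(by_edges)
```

Changing the notion of size changes only which component is tallied, which is why the same family yields several different generating functions.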


A path often taken in the literature is to decompose the structures to be enumer- 
ated into smaller structures either of the same type or of simpler types, and then extract 
from such a decomposition recurrence relations that are satisfied by the $\{A_n\}$. In this
context, the recurrence relations are either solved directly—whenever they are simple 
enough—or by means of ad hoc generating functions, introduced as mere technical 
artifices. 

By contrast, in the framework of this book, classes of combinatorial structures 
are built directly in terms of simpler classes by means of a collection of elementary 
combinatorial constructions. This closely resembles the description of formal lan- 
guages by means of grammars, as well as the construction of structured data types in 
programming languages. The approach developed here has been termed symbolic, as 
it relies on a formal specification language for combinatorial structures. Specifically, 
it is based on so-called admissible constructions that permit direct translations into 
generating functions. 


Definition I.5. Let Φ be an m-ary construction that associates to any collection of
classes $\mathcal{B}^{(1)}, \ldots, \mathcal{B}^{(m)}$ a new class

$$\mathcal{A} = \Phi[\mathcal{B}^{(1)}, \ldots, \mathcal{B}^{(m)}].$$

The construction Φ is admissible iff the counting sequence $(A_n)$ of $\mathcal{A}$ only depends on
the counting sequences $(B^{(1)}_n), \ldots, (B^{(m)}_n)$ of $\mathcal{B}^{(1)}, \ldots, \mathcal{B}^{(m)}$.

For such an admissible construction, there then exists a well-defined operator Ψ
acting on the corresponding ordinary generating functions:

$$A(z) = \Psi[B^{(1)}(z), \ldots, B^{(m)}(z)],$$

and it is this basic fact about admissibility that will be used throughout the book.


As an introductory example, take the construction of cartesian product, which is
the usual one enriched with a natural notion of size.

Definition I.6. The cartesian product construction applied to two classes $\mathcal{B}$ and $\mathcal{C}$
forms ordered pairs,

(11)   $$\mathcal{A} = \mathcal{B} \times \mathcal{C} \quad\text{iff}\quad \mathcal{A} = \{\alpha = (\beta, \gamma) \mid \beta \in \mathcal{B},\ \gamma \in \mathcal{C}\},$$

with the size of a pair $\alpha = (\beta, \gamma)$ being defined by

(12)   $$|\alpha|_{\mathcal{A}} = |\beta|_{\mathcal{B}} + |\gamma|_{\mathcal{C}}.$$

By considering all possibilities, it is immediately seen that the counting sequences
corresponding to $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$ are related by the convolution relation

(13)   $$A_n = \sum_{k=0}^{n} B_k C_{n-k},$$

which means admissibility. Furthermore, we recognize here the formula for a product
of two power series:

(14)   $$A(z) = B(z) \cdot C(z).$$

In summary: the cartesian product is admissible and it translates as a product of
OGFs.
Similarly, let $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$ be combinatorial classes satisfying

(15)   $\mathcal{A} = \mathcal{B} \cup \mathcal{C}$,  with  $\mathcal{B} \cap \mathcal{C} = \emptyset$,

with size defined in a consistent manner: for $\omega \in \mathcal{A}$,

(16)   $|\omega|_{\mathcal{A}} = \begin{cases} |\omega|_{\mathcal{B}} & \text{if } \omega \in \mathcal{B} \\ |\omega|_{\mathcal{C}} & \text{if } \omega \in \mathcal{C}. \end{cases}$

One has

(17)   $A_n = B_n + C_n$,

which, at generating function level, means

(18)   $A(z) = B(z) + C(z)$.

Thus, the union of disjoint sets is admissible and it translates as a sum of generating
functions. (A more formal version of this statement is given in the next section.)
The correspondences provided by (11)–(14) and (15)–(18) are summarized by the
strikingly simple dictionary

(19)   $\mathcal{A} = \mathcal{B} \cup \mathcal{C} \implies A(z) = B(z) + C(z)$   (provided $\mathcal{B} \cap \mathcal{C} = \emptyset$)
       $\mathcal{A} = \mathcal{B} \times \mathcal{C} \implies A(z) = B(z) \cdot C(z)$,

to be compared with the plain arithmetic case of (3), p. 18. The merit of such relations
is that they can be stated as general purpose translation rules that only need to
be established once and for all. As soon as the problem of counting elements of a 
union of disjoint sets or a cartesian product is recognized, it becomes possible to dis- 
pense altogether with the intermediate stages of writing explicitly coefficient relations 
or recurrences as in (13) or (17). This is the spirit of the symbolic method for com- 
binatorial enumerations. Its interest lies in the fact that several powerful set-theoretic 
constructions are amenable to such a treatment, as we see in the next section.


▷ I.3. Continuity, Lipschitz and Hölder conditions. An admissible construction is said to be
continuous if it is a continuous function on the space of formal power series equipped with its
standard ultrametric distance (Appendix A.5: Formal power series, p. 730). Continuity captures
the desirable property that constructions depend on their arguments in a finitary way. For all
the constructions of this book, there furthermore exists a function $\vartheta(n)$, such that $(A_n)$ only
depends on the first $\vartheta(n)$ elements of the $(B^{(1)}_k), \ldots, (B^{(m)}_k)$, with $\vartheta(n) \le Kn + L$ (Hölder
condition) or $\vartheta(n) \le n + L$ (Lipschitz condition). For instance, the functional $f(z) \mapsto f(z^2)$
is Hölder; the functional $f(z) \mapsto \partial_z f(z)$ is Lipschitz. ◁


I.2. Admissible constructions and specifications 


The main goal of this section is to introduce formally the basic constructions that 
constitute the core of a specification language for combinatorial structures. This core 
is based on disjoint unions, also known as combinatorial sums, and on cartesian prod- 
ucts that we have just discussed. We shall augment it by the constructions of sequence, 
cycle, multiset, and powerset. A class is constructible or specifiable if it can be de- 
fined from primal elements by means of these constructions. The generating function 
of any such class satisfies functional equations that can be transcribed systematically 
from a specification; see Theorems I.1 (p. 27) and I.2 (p. 33), as well as Figure I.18
(p. 93) at the end of this chapter for a summary. 


I.2.1. Basic constructions. First, we assume we are given a class $\mathcal{E}$ called the
neutral class that consists of a single object of size 0; any such object of size 0 is
called a neutral object and is usually denoted by symbols such as $\epsilon$ or 1. The reason
for this terminology becomes clear if one considers the combinatorial isomorphism

$\mathcal{A} \cong \mathcal{E} \times \mathcal{A} \cong \mathcal{A} \times \mathcal{E}$.

We also assume as given an atomic class $\mathcal{Z}$ comprising a single element of size 1;
any such element is called an atom; an atom may be used to describe a generic node
in a tree or graph, in which case it may be represented by a circle (• or ◦), but also a
generic letter in a word, in which case it may be instantiated as a, b, c, .... Distinct
copies of the neutral or atomic class may also be subscripted by indices in various
ways. Thus, for instance, we may use the classes $\mathcal{Z}_a = \{a\}$, $\mathcal{Z}_b = \{b\}$ (with a, b
of size 1) to build up binary words over the alphabet {a, b}, or $\mathcal{Z}_\bullet = \{\bullet\}$, $\mathcal{Z}_\circ = \{\circ\}$
(with •, ◦ taken to be of size 1) to build trees with nodes of two colours. Similarly,
we may introduce $\mathcal{E}_\square$, $\mathcal{E}_1$, $\mathcal{E}_2$ to denote a class comprising the neutral objects $\square$, $\epsilon_1$, $\epsilon_2$
respectively.

Clearly, the generating functions of a neutral class $\mathcal{E}$ and an atomic class $\mathcal{Z}$ are

$E(z) = 1$,   $Z(z) = z$,

corresponding to the unit 1, and the variable z, of generating functions.


Combinatorial sum (disjoint union). The intent of combinatorial sum, also known
as disjoint union, is to capture the idea of a union of disjoint sets, but without any
extraneous condition (disjointness) being imposed on the arguments of the construction.
To do so, we formalize the (combinatorial) sum of two classes $\mathcal{B}$ and $\mathcal{C}$ as the union
(in the standard set-theoretic sense) of two disjoint copies, say $\mathcal{B}^{\square}$ and $\mathcal{C}^{\diamond}$, of $\mathcal{B}$ and
$\mathcal{C}$. A picturesque way to view the construction is as follows: first choose two distinct
colours and repaint the elements of $\mathcal{B}$ with the first colour and the elements of $\mathcal{C}$ with
the second colour. This is made precise by introducing two distinct “markers”, say
$\square$ and $\diamond$, each a neutral object (i.e., of size zero); the disjoint union $\mathcal{B} + \mathcal{C}$ of $\mathcal{B}$, $\mathcal{C}$ is then
defined as a standard set-theoretic union:

$\mathcal{B} + \mathcal{C} := (\{\square\} \times \mathcal{B}) \cup (\{\diamond\} \times \mathcal{C})$.


The size of an object in a disjoint union A = B + C is by definition inherited from its 
size in its class of origin, as in Equation (16). One good reason behind the definition 
adopted here is that the combinatorial sum of two classes is always well defined, no 
matter whether or not the classes intersect. Furthermore, disjoint union is equivalent 
to a standard union whenever it is applied to disjoint sets. 

Because of disjointness of the copies, one has the implication 


$\mathcal{A} = \mathcal{B} + \mathcal{C} \implies A_n = B_n + C_n$  and  $A(z) = B(z) + C(z)$,

so that disjoint union is admissible. Note that, in contrast, standard set-theoretic union
is not an admissible construction since

$\mathrm{card}(\mathcal{B}_n \cup \mathcal{C}_n) = \mathrm{card}(\mathcal{B}_n) + \mathrm{card}(\mathcal{C}_n) - \mathrm{card}(\mathcal{B}_n \cap \mathcal{C}_n)$,

and information on the internal structure of $\mathcal{B}$ and $\mathcal{C}$ (i.e., the nature of their
intersection) is needed in order to be able to enumerate the elements of their union.

Cartesian product. This construction A = B x C forms all possible ordered pairs 
in accordance with Definition I.6. The size of a pair is obtained additively from the 
size of components in accordance with (12). 


Next, we introduce a few fundamental constructions that build upon set-theoretic 
union and product, and form sequences, sets, and cycles. These powerful construc- 
tions suffice to define a broad variety of combinatorial structures. 

Sequence construction. If $\mathcal{B}$ is a class then the sequence class $\mathrm{SEQ}(\mathcal{B})$ is defined
as the infinite sum

$\mathrm{SEQ}(\mathcal{B}) = \{\epsilon\} + \mathcal{B} + (\mathcal{B} \times \mathcal{B}) + (\mathcal{B} \times \mathcal{B} \times \mathcal{B}) + \cdots$

with $\epsilon$ being a neutral structure (of size 0). In other words, we have

$\mathcal{A} = \{(\beta_1, \ldots, \beta_\ell) \mid \ell \ge 0,\ \beta_j \in \mathcal{B}\}$,

which matches our intuition as to what sequences should be. (The neutral structure in
this context corresponds to $\ell = 0$; it plays a rôle similar to that of the “empty” word in
formal language theory.) It is then readily checked that the construction $\mathcal{A} = \mathrm{SEQ}(\mathcal{B})$
defines a proper class satisfying the finiteness condition for sizes if and only if $\mathcal{B}$
contains no object of size 0. From the definition of size for sums and products, it
follows that the size of an object $\alpha \in \mathcal{A}$ is to be taken as the sum of the sizes of its
components:

$\alpha = (\beta_1, \ldots, \beta_\ell) \implies |\alpha| = |\beta_1| + \cdots + |\beta_\ell|$.

Cycle construction. Sequences taken up to a circular shift of their components
define cycles, the notation being $\mathrm{CYC}(\mathcal{B})$. In precise terms, one has³

$\mathrm{CYC}(\mathcal{B}) := (\mathrm{SEQ}(\mathcal{B}) \setminus \{\epsilon\}) / \mathbf{S}$,

where $\mathbf{S}$ is the equivalence relation between sequences defined by

$(\beta_1, \ldots, \beta_r)\ \mathbf{S}\ (\beta'_1, \ldots, \beta'_r)$

iff there exists some circular shift $\tau$ of $[1\,..\,r]$ such that for all $j$, $\beta'_j = \beta_{\tau(j)}$; in other
words, for some $d$, one has $\beta'_j = \beta_{1 + (j-1+d) \bmod r}$ for all $j$. Here is, for instance, a depiction
of the cycles formed from the 8 and 16 sequences of lengths 3 and 4 over two types of
objects (a, b): the number of cycles is 4 (for n = 3) and 6 (for n = 4). Sequences are
grouped into equivalence classes according to the relation $\mathbf{S}$:

(20)   3-cycles:  {aaa}, {aab, aba, baa}, {abb, bba, bab}, {bbb}
       4-cycles:  {aaaa}, {aaab, aaba, abaa, baaa}, {aabb, abba, bbaa, baab}, {abab, baba},
                  {abbb, bbba, bbab, babb}, {bbbb}

According to the definition, this construction corresponds to the formation of directed
cycles (see also the necklaces of Note I.1, p. 18). We make only a limited use of it
for unlabelled objects; however, its counterpart plays a rather important rôle in the
context of labelled structures and exponential generating functions of Chapter II.

Multiset construction. Following common mathematical terminology, multisets 
are like finite sets (that is the order between elements does not count), but arbitrary 
repetitions of elements are allowed. The notation is A = MSET(B) when A is ob- 
tained by forming all finite multisets of elements from B. The precise way of defining 
MSET(B) is as a quotient: 


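$\mathrm{MSET}(\mathcal{B}) := \mathrm{SEQ}(\mathcal{B}) / \mathbf{R}$  with $\mathbf{R}$,

the equivalence relation of sequences being defined by $(\alpha_1, \ldots, \alpha_r)\ \mathbf{R}\ (\beta_1, \ldots, \beta_r)$ iff
there exists some arbitrary permutation $\sigma$ of $[1\,..\,r]$ such that for all $j$, $\beta_j = \alpha_{\sigma(j)}$.

Powerset construction. The powerset class (or set class) $\mathcal{A} = \mathrm{PSET}(\mathcal{B})$ is defined
as the class consisting of all finite subsets of class $\mathcal{B}$, or equivalently, as the class
$\mathrm{PSET}(\mathcal{B}) \subset \mathrm{MSET}(\mathcal{B})$ formed of multisets that involve no repetitions.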


We again need to make explicit the way the size function is defined when such 
constructions are performed: as for products and sequences, the size of a composite 
object (set, multiset, or cycle) is defined to be the sum of the sizes of its components.


▷ I.4. The semi-ring of combinatorial classes. Under the convention of identifying isomorphic
classes, sum and product acquire pleasant algebraic properties: combinatorial sums and
cartesian products become commutative and associative operations, e.g.,

$(\mathcal{A} + \mathcal{B}) + \mathcal{C} = \mathcal{A} + (\mathcal{B} + \mathcal{C})$,   $\mathcal{A} \times (\mathcal{B} \times \mathcal{C}) = (\mathcal{A} \times \mathcal{B}) \times \mathcal{C}$,

while distributivity holds, $(\mathcal{A} + \mathcal{B}) \times \mathcal{C} = (\mathcal{A} \times \mathcal{C}) + (\mathcal{B} \times \mathcal{C})$. ◁

▷ I.5. Natural numbers. Let $\mathcal{Z} := \{\bullet\}$ with • an atom (of size 1). Then $\mathcal{I} = \mathrm{SEQ}(\mathcal{Z}) \setminus \{\epsilon\}$
is a way of describing positive integers in unary notation: $\mathcal{I} = \{\bullet, \bullet\bullet, \bullet\bullet\bullet, \ldots\}$. The
corresponding OGF is $I(z) = z/(1-z) = z + z^2 + z^3 + \cdots$. ◁

▷ I.6. Interval coverings. Let $\mathcal{Z} := \{\bullet\}$ be as before. Then $\mathcal{A} = \mathcal{Z} + (\mathcal{Z} \times \mathcal{Z})$ is a set of two
elements, • and (•, •), which we choose to draw as {•, •–•}. Then $\mathcal{C} = \mathrm{SEQ}(\mathcal{A})$ contains

•, ••, •–•, •••, ••–•, •–••, ••••, ...

With the notion of size adopted, the objects of size n in $\mathcal{C} = \mathrm{SEQ}(\mathcal{Z} + (\mathcal{Z} \times \mathcal{Z}))$ are (isomorphic
to) the coverings of [0, n] by intervals (matches) of length either 1 or 2. The OGF

$C(z) = 1 + z + 2z^2 + 3z^3 + 5z^4 + 8z^5 + 13z^6 + 21z^7 + 34z^8 + 55z^9 + \cdots$,

is, as we shall see shortly (p. 42), the OGF of Fibonacci numbers. ◁

³ By convention, there are no “empty” cycles.
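
The coefficients quoted in Note I.6 can be reproduced mechanically. The sketch below expands $1/(1 - (z + z^2))$, which is what the sequence rule of the next subsection assigns to $\mathrm{SEQ}(\mathcal{Z} + \mathcal{Z} \times \mathcal{Z})$, and compares it with a direct enumeration of coverings. The function names `quasi_inverse` and `count_coverings` are hypothetical helpers introduced only for this illustration.

```python
from itertools import product

def quasi_inverse(b, n_terms):
    """Coefficients of 1/(1 - B(z)) up to z^(n_terms - 1); requires b[0] == 0."""
    assert b[0] == 0
    a = [1] + [0] * (n_terms - 1)
    for n in range(1, n_terms):
        # From A = 1 + B*A: a_n = sum_{k>=1} b_k * a_{n-k}.
        a[n] = sum(b[k] * a[n - k] for k in range(1, min(n, len(b) - 1) + 1))
    return a

def count_coverings(n):
    """Brute force: sequences of interval lengths in {1, 2} whose sum is n."""
    return sum(
        1
        for r in range(n + 1)
        for parts in product((1, 2), repeat=r)
        if sum(parts) == n
    )

# B(z) = z + z^2 is the OGF of A = Z + (Z x Z).
coefficients = quasi_inverse([0, 1, 1], 10)
print(coefficients)
print([count_coverings(n) for n in range(10)])
```

The brute-force count is exponential in n and is meant only as a sanity check on small sizes.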


I.2.2. The admissibility theorem for ordinary generating functions. This section
is a formal treatment of admissibility proofs for the constructions that we have
introduced. The final implication is that any specification of a constructible class
translates directly into generating function equations. The translation of the cycle
construction involves the Euler totient function $\varphi(k)$ defined as the number of integers
in [1, k] that are relatively prime to k (Appendix A.1: Arithmetical functions, p. 721).


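Theorem I.1 (Basic admissibility, unlabelled universe). The constructions of union,
cartesian product, sequence, powerset, multiset, and cycle are all admissible. The
associated operators are as follows.

Sum:                $\mathcal{A} = \mathcal{B} + \mathcal{C} \implies A(z) = B(z) + C(z)$

Cartesian product:  $\mathcal{A} = \mathcal{B} \times \mathcal{C} \implies A(z) = B(z) \cdot C(z)$

Sequence:           $\mathcal{A} = \mathrm{SEQ}(\mathcal{B}) \implies A(z) = \dfrac{1}{1 - B(z)}$

Powerset:           $\mathcal{A} = \mathrm{PSET}(\mathcal{B}) \implies A(z) = \prod_{n \ge 1} (1 + z^n)^{B_n} = \exp\Big( \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} B(z^k) \Big)$

Multiset:           $\mathcal{A} = \mathrm{MSET}(\mathcal{B}) \implies A(z) = \prod_{n \ge 1} (1 - z^n)^{-B_n} = \exp\Big( \sum_{k=1}^{\infty} \frac{1}{k} B(z^k) \Big)$

Cycle:              $\mathcal{A} = \mathrm{CYC}(\mathcal{B}) \implies A(z) = \sum_{k=1}^{\infty} \frac{\varphi(k)}{k} \log \frac{1}{1 - B(z^k)}$

For the sequence, powerset, multiset, and cycle translations, it is assumed that $\mathcal{B}_0 = \emptyset$.
The class $\mathcal{E} = \{\epsilon\}$ consisting of the neutral object only, and the class $\mathcal{Z}$ consisting of
a single “atomic” object (node, letter) of size 1 have OGFs

$E(z) = 1$  and  $Z(z) = z$.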
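Proof. The proof proceeds case by case, building upon what we have just seen regarding
unions and products.

Combinatorial sum (disjoint union). Let $\mathcal{A} = \mathcal{B} + \mathcal{C}$. Since the union is disjoint,
and the size of an $\mathcal{A}$-element coincides with its size in $\mathcal{B}$ or $\mathcal{C}$, one has $A_n = B_n + C_n$
and $A(z) = B(z) + C(z)$, as discussed earlier. The rule also follows directly from the
combinatorial form of generating functions as expressed by (8), p. 19:

$A(z) = \sum_{\alpha \in \mathcal{A}} z^{|\alpha|} = \sum_{\alpha \in \mathcal{B}} z^{|\alpha|} + \sum_{\alpha \in \mathcal{C}} z^{|\alpha|} = B(z) + C(z)$.

Cartesian product. The admissibility result for $\mathcal{A} = \mathcal{B} \times \mathcal{C}$ was considered as
an example for Definition I.6, the convolution equation (13) leading to the relation
$A(z) = B(z) \cdot C(z)$. We can also offer a direct derivation based on the combinatorial
form of generating functions (8), p. 19,

$A(z) = \sum_{\alpha \in \mathcal{A}} z^{|\alpha|} = \sum_{(\beta, \gamma) \in (\mathcal{B} \times \mathcal{C})} z^{|\beta| + |\gamma|} = \Big( \sum_{\beta \in \mathcal{B}} z^{|\beta|} \Big) \times \Big( \sum_{\gamma \in \mathcal{C}} z^{|\gamma|} \Big) = B(z) \cdot C(z)$,

as follows from distributing products over sums. This derivation readily extends to an
arbitrary number of factors.

Sequence construction. Admissibility for $\mathcal{A} = \mathrm{SEQ}(\mathcal{B})$ (with $\mathcal{B}_0 = \emptyset$) follows
from the union and product relations. One has

$\mathcal{A} = \{\epsilon\} + \mathcal{B} + (\mathcal{B} \times \mathcal{B}) + (\mathcal{B} \times \mathcal{B} \times \mathcal{B}) + \cdots$,

so that

$A(z) = 1 + B(z) + B(z)^2 + B(z)^3 + \cdots = \dfrac{1}{1 - B(z)}$,

where the geometric sum converges in the sense of formal power series since $[z^0]B(z) = 0$,
by assumption.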

Powerset construction. Let $\mathcal{A} = \mathrm{PSET}(\mathcal{B})$ and first take $\mathcal{B}$ to be finite. Then, the
class $\mathcal{A}$ of all the finite subsets of $\mathcal{B}$ is isomorphic to a product,

(21)   $\mathrm{PSET}(\mathcal{B}) \cong \prod_{\beta \in \mathcal{B}} (\{\epsilon\} + \{\beta\})$,

with $\epsilon$ a neutral structure of size 0. Indeed, distributing the products in all possible
ways forms all the possible combinations (sets with no repetition allowed) of elements
of $\mathcal{B}$; the reasoning is the same as what leads to an identity such as

$(1 + a)(1 + b)(1 + c) = 1 + [a + b + c] + [ab + bc + ac] + abc$,

where all combinations of variables appear in monomials. Then, directly from the
combinatorial form of generating functions and the sum and product rules, we find

(22)   $A(z) = \prod_{\beta \in \mathcal{B}} (1 + z^{|\beta|}) = \prod_{n} (1 + z^n)^{B_n}$.

The exp–log transformation $A(z) = \exp(\log A(z))$ then yields

(23)   $A(z) = \exp\Big( \sum_{n=1}^{\infty} B_n \log(1 + z^n) \Big)$
            $= \exp\Big( \sum_{n=1}^{\infty} B_n \cdot \sum_{k=1}^{\infty} (-1)^{k-1} \frac{z^{nk}}{k} \Big)$
            $= \exp\Big( \frac{B(z)}{1} - \frac{B(z^2)}{2} + \frac{B(z^3)}{3} - \cdots \Big)$,

where the second line results from expanding the logarithm,

$\log(1 + u) = \frac{u}{1} - \frac{u^2}{2} + \frac{u^3}{3} - \cdots$,

and the third line results from exchanging the order of summations.

The proof finally extends to the case of $\mathcal{B}$ being infinite by noting that each $A_n$
depends only on those $B_j$ for which $j \le n$, to which the relations given above for the
finite case apply. Precisely, let $\mathcal{B}^{(\le m)} = \sum_{k=1}^{m} \mathcal{B}_k$ and $\mathcal{A}^{(\le m)} = \mathrm{PSET}(\mathcal{B}^{(\le m)})$. Then,
with $O(z^{m+1})$ denoting any series that has no term of degree $\le m$, one has

$A(z) = A^{(\le m)}(z) + O(z^{m+1})$  and  $B(z) = B^{(\le m)}(z) + O(z^{m+1})$.

On the other hand, $A^{(\le m)}(z)$ and $B^{(\le m)}(z)$ are connected by the fundamental
exponential relation (23), since $\mathcal{B}^{(\le m)}$ is finite. Letting m tend to infinity, there follows in
the limit

$A(z) = \exp\Big( \frac{B(z)}{1} - \frac{B(z^2)}{2} + \frac{B(z^3)}{3} - \cdots \Big)$.

(See Appendix A.5: Formal power series, p. 730 for the notion of formal convergence.)
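
The agreement between the product form (22) and the exponential form (23) can be checked on truncated series. The sketch below is only an illustration: the helper names `pset_product`, `series_exp`, and `pset_exp` are assumptions, exact rational arithmetic is used to avoid rounding, and the example class is that of the positive integers ($B_n = 1$ for $n \ge 1$), whose powerset consists of partitions into distinct parts.

```python
from fractions import Fraction

def pset_product(b, n_terms):
    """Coefficients of prod_{n>=1} (1 + z^n)^{b[n]}, truncated to n_terms terms."""
    a = [1] + [0] * (n_terms - 1)
    for n in range(1, min(len(b), n_terms)):
        for _ in range(b[n]):
            # Multiply in place by (1 + z^n); descending m keeps old values intact.
            for m in range(n_terms - 1, n - 1, -1):
                a[m] += a[m - n]
    return a

def series_exp(f, n_terms):
    """exp(f(z)) for a series with f[0] == 0, using n*g_n = sum_k k*f_k*g_{n-k}."""
    g = [Fraction(1)] + [Fraction(0)] * (n_terms - 1)
    for n in range(1, n_terms):
        g[n] = sum(k * f[k] * g[n - k] for k in range(1, n + 1)) / n
    return g

def pset_exp(b, n_terms):
    """Same series through exp(B(z)/1 - B(z^2)/2 + B(z^3)/3 - ...)."""
    f = [Fraction(0)] * n_terms
    for k in range(1, n_terms):
        for n in range(1, len(b)):
            if n * k < n_terms:
                f[n * k] += Fraction((-1) ** (k - 1) * b[n], k)
    return series_exp(f, n_terms)

b = [0] + [1] * 10                 # B_n = 1 for 1 <= n <= 10 (example class)
distinct_parts = pset_product(b, 11)
via_exp = pset_exp(b, 11)
print(distinct_parts)
print([int(x) for x in via_exp])
```

Both computations use only $B_1, \ldots, B_{10}$, in line with the limit argument: the coefficient $A_n$ depends only on the $B_j$ with $j \le n$.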

Multiset construction. First for finite $\mathcal{B}$ (with $\mathcal{B}_0 = \emptyset$), the multiset class $\mathcal{A} =
\mathrm{MSET}(\mathcal{B})$ is definable by

(24)   $\mathrm{MSET}(\mathcal{B}) \cong \prod_{\beta \in \mathcal{B}} \mathrm{SEQ}(\{\beta\})$.

In words, any multiset can be sorted, in which case it can be viewed as formed of a
sequence of repeated elements $\beta_1$, followed by a sequence of repeated elements $\beta_2$,
and so on, where $\beta_1, \beta_2, \ldots$ is a canonical listing of the elements of $\mathcal{B}$. The relation
translates into generating functions by the product and sequence rules,

(25)   $A(z) = \prod_{\beta \in \mathcal{B}} (1 - z^{|\beta|})^{-1} = \prod_{n=1}^{\infty} (1 - z^n)^{-B_n}$
            $= \exp\Big( \sum_{n=1}^{\infty} B_n \log(1 - z^n)^{-1} \Big)$
            $= \exp\Big( \frac{B(z)}{1} + \frac{B(z^2)}{2} + \frac{B(z^3)}{3} + \cdots \Big)$,

where the exponential form results from the exp–log transformation. The case of an
infinite class $\mathcal{B}$ follows by a limit argument analogous to the one used for powersets.
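
To see the product formula (25) at work on a finite class, one can enumerate multisets directly. In the sketch below, the class with two objects of size 1 and one object of size 2 is a made-up example, and `mset_product` and `mset_brute` are hypothetical helper names; the brute-force side uses `itertools.combinations_with_replacement`, which lists multisets of a given cardinality.

```python
from itertools import combinations_with_replacement

def mset_product(b, n_terms):
    """Coefficients of prod_{n>=1} (1 - z^n)^(-b[n]), truncated to n_terms terms."""
    a = [1] + [0] * (n_terms - 1)
    for n in range(1, min(len(b), n_terms)):
        for _ in range(b[n]):
            # Multiply in place by 1/(1 - z^n); ascending m accumulates repetitions.
            for m in range(n, n_terms):
                a[m] += a[m - n]
    return a

def mset_brute(sizes, n_terms):
    """Count multisets over objects with the given sizes, grouped by total size."""
    counts = [0] * n_terms
    # Every object has size >= 1, so at most n_terms - 1 components are relevant.
    for r in range(n_terms):
        for combo in combinations_with_replacement(range(len(sizes)), r):
            total = sum(sizes[i] for i in combo)
            if total < n_terms:
                counts[total] += 1
    return counts

sizes = [1, 1, 2]      # example class: two objects of size 1, one of size 2
b = [0, 2, 1]          # its counting sequence: B_1 = 2, B_2 = 1
by_formula = mset_product(b, 7)
by_enumeration = mset_brute(sizes, 7)
print(by_formula, by_enumeration)
```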
Cycle construction. The translation of the cycle relation $\mathcal{A} = \mathrm{CYC}(\mathcal{B})$ turns out
to be

$A(z) = \sum_{k=1}^{\infty} \frac{\varphi(k)}{k} \log \frac{1}{1 - B(z^k)}$,

where $\varphi(k)$ is the Euler totient function. The first terms, with $L_k(z) := \log(1 - B(z^k))^{-1}$,
are

$A(z) = \frac{1}{1} L_1(z) + \frac{1}{2} L_2(z) + \frac{2}{3} L_3(z) + \frac{2}{4} L_4(z) + \frac{4}{5} L_5(z) + \frac{2}{6} L_6(z) + \cdots$.

We reserve the proof to Appendix A.4: Cycle construction, p. 729, since it relies in
part on multivariate generating functions to be officially introduced in Chapter III.
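
Although the proof is deferred, the cycle formula is easy to test numerically. The following sketch evaluates $\sum_k \frac{\varphi(k)}{k} \log \frac{1}{1 - B(z^k)}$ on truncated series with exact rationals and compares the result, for $B(z) = 2z$, with a brute-force count of binary words up to rotation (the 4 and 6 cycles of display (20) appear at n = 3 and n = 4). All function names are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product
from math import gcd

def totient(k):
    """Euler totient: number of integers in [1, k] relatively prime to k."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)

def log_quasi_inverse(g, n_terms):
    """Coefficients of log 1/(1 - g(z)) for g[0] == 0, from (1 - g) L' = g'."""
    L = [Fraction(0)] * n_terms
    for n in range(1, n_terms):
        L[n] = g[n] + sum(k * L[k] * g[n - k] for k in range(1, n)) / Fraction(n)
    return L

def cyc_ogf(b, n_terms):
    """Truncated OGF of CYC(B) via the totient formula of Theorem I.1."""
    a = [Fraction(0)] * n_terms
    for k in range(1, n_terms):
        g = [Fraction(0)] * n_terms          # g(z) = B(z^k), truncated
        for n in range(1, len(b)):
            if n * k < n_terms:
                g[n * k] = Fraction(b[n])
        L = log_quasi_inverse(g, n_terms)
        for n in range(n_terms):
            a[n] += Fraction(totient(k), k) * L[n]
    return a

def necklaces_brute(n):
    """Binary words of length n up to circular shift, via canonical rotations."""
    return len({min(w[i:] + w[:i] for i in range(n)) for w in product("ab", repeat=n)})

cycles = cyc_ogf([0, 2], 7)                  # B(z) = 2z: two atoms a, b
print([int(x) for x in cycles])
print([necklaces_brute(n) for n in range(1, 7)])
```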


The results for sets, multisets, and cycles are particular cases of the well-known
Pólya theory that deals more generally with the enumeration of objects under group
symmetry actions; for Pólya’s original and its edited version, see [488, 491]. This
theory is described in many textbooks, for instance, those of Comtet [129] and Harary 
and Palmer [129, 319]; Notes I.58-I.60, pp. 85-86, distil its most basic aspects. The 
approach adopted here amounts to considering simultaneously all possible values of 
the number of components by means of bivariate generating functions. Powerful gen- 
eralizations within Joyal’s elegant theory of species [359] are presented in the book 
by Bergeron, Labelle, and Leroux [50].

▷ I.7. Vallée’s identity. Let $\mathcal{M} = \mathrm{MSET}(\mathcal{C})$, $\mathcal{P} = \mathrm{PSET}(\mathcal{C})$. One has combinatorially:

$M(z) = P(z) M(z^2)$.

(Hint: a multiset contains elements of either odd or even multiplicity.) Accordingly, one can
deduce the translation of powersets from the formula for multisets. Iterating the relation above
yields $M(z) = P(z) P(z^2) P(z^4) P(z^8) \cdots$: this is closely related to the binary representation
of numbers and to Euler’s identity (p. 49). It is used for instance in Note I.66, p. 91. ◁
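
Vallée's identity and its iterated form lend themselves to a direct check on truncated series. In the sketch below the counting sequence `c` is an arbitrary made-up example, and `mset`, `pset`, `dilate`, and `multiply` are hypothetical helpers built on the product forms of Theorem I.1.

```python
def mset(c, n_terms):
    """Truncated prod (1 - z^n)^(-c[n])."""
    a = [1] + [0] * (n_terms - 1)
    for n in range(1, min(len(c), n_terms)):
        for _ in range(c[n]):
            for m in range(n, n_terms):
                a[m] += a[m - n]
    return a

def pset(c, n_terms):
    """Truncated prod (1 + z^n)^(c[n])."""
    a = [1] + [0] * (n_terms - 1)
    for n in range(1, min(len(c), n_terms)):
        for _ in range(c[n]):
            for m in range(n_terms - 1, n - 1, -1):
                a[m] += a[m - n]
    return a

def dilate(a, k):
    """Coefficients of A(z^k), truncated to the length of a."""
    out = [0] * len(a)
    for n, coeff in enumerate(a):
        if k * n < len(a):
            out[k * n] = coeff
    return out

def multiply(p, q):
    """Truncated product of two series of equal length."""
    return [sum(p[k] * q[m - k] for k in range(m + 1)) for m in range(len(p))]

N = 8
c = [0, 2, 1, 3, 0, 1, 4, 2]       # arbitrary example counting sequence, c[0] = 0
M, P = mset(c, N), pset(c, N)

one_step = multiply(P, dilate(M, 2))          # P(z) * M(z^2)

iterated = [1] + [0] * (N - 1)                # P(z) P(z^2) P(z^4) ...
k = 1
while k < N:
    iterated = multiply(iterated, dilate(P, k))
    k *= 2
print(M == one_step, M == iterated)
```

The iterated product may stop once $2^j \ge N$, since the remaining factor $M(z^{2^j})$ is $1 + O(z^N)$.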

Restricted constructions. In order to increase the descriptive power of the framework
of constructions, we ought to be able to allow restrictions on the number of
components in sequences, sets, multisets, and cycles. Let $\mathfrak{K}$ be a metasymbol
representing any of SEQ, CYC, MSET, PSET and let $\Omega$ be a predicate over the integers;
then $\mathfrak{K}_{\Omega}(\mathcal{A})$ will represent the class of objects constructed by $\mathfrak{K}$, with a number of
components constrained to satisfy $\Omega$. For instance, the notation

(26)   $\mathrm{SEQ}_{=k}$ (or simply $\mathrm{SEQ}_k$),   $\mathrm{SEQ}_{>k}$,   $\mathrm{SEQ}_{1\,..\,k}$

refers to sequences whose number of components is exactly k, larger than k, or in
the interval 1..k respectively. In particular,

$\mathrm{SEQ}_k(\mathcal{B}) := \underbrace{\mathcal{B} \times \cdots \times \mathcal{B}}_{k \text{ times}} \equiv \mathcal{B}^k$,   $\mathrm{SEQ}_{\ge k}(\mathcal{B}) = \sum_{j \ge k} \mathcal{B}^j \cong \mathcal{B}^k \times \mathrm{SEQ}(\mathcal{B})$,

$\mathrm{MSET}_k(\mathcal{B}) := \mathrm{SEQ}_k(\mathcal{B}) / \mathbf{R}$.

Similarly, $\mathrm{SEQ}_{\mathrm{odd}}$, $\mathrm{SEQ}_{\mathrm{even}}$ will denote sequences with an odd or even number of
components, and so on.

Translations for such restricted constructions are available, as shown generally
in Subsection I.6.1, p. 83. Suffice it to note for the moment that the construction
$\mathcal{A} = \mathrm{SEQ}_k(\mathcal{B})$ is really an abbreviation for a k-fold product, hence it admits the
translation into OGFs

(27)   $\mathcal{A} = \mathrm{SEQ}_k(\mathcal{B}) \implies A(z) = B(z)^k$.


I.2.3. Constructibility and combinatorial specifications. By composing basic 
constructions, we can build compact descriptions (specifications) of a broad variety of 
combinatorial classes. Since we restrict attention to admissible constructions, we can 
immediately derive OGFs for these classes. Put differently, the task of enumerating a 
combinatorial class is reduced to programming a specification for it in the language of 
admissible constructions. In this subsection, we first discuss the expressive power of 
the language of constructions, then summarize the symbolic method (for unlabelled 
classes and OGFs) by Theorem I.2.


First, in the framework just introduced, the class of all binary words is described
by

$\mathcal{W} = \mathrm{SEQ}(\mathcal{A})$,  where  $\mathcal{A} = \{a, b\} \cong \mathcal{Z} + \mathcal{Z}$,

the ground alphabet, comprises two elements (letters) of size 1. The size of a binary
word then coincides with its length (the number of letters it contains). In other terms,
we start from basic atomic elements and build up words by forming freely all the
objects determined by the sequence construction. Such a combinatorial description of a
class that only involves a composition of basic constructions applied to initial classes
$\mathcal{E}$, $\mathcal{Z}$ is said to be an iterative (or non-recursive) specification. Other examples
already encountered include binary necklaces (Note I.1, p. 18) and the positive integers
(Note I.5, p. 27) respectively defined by

$\mathcal{N} = \mathrm{CYC}(\mathcal{Z} + \mathcal{Z})$  and  $\mathcal{I} = \mathrm{SEQ}_{\ge 1}(\mathcal{Z})$.

From this, one can construct ever more complicated objects. For instance,

$\mathcal{P} = \mathrm{MSET}(\mathcal{I}) \equiv \mathrm{MSET}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$

means the class of multisets of positive integers, which is isomorphic to the class of
integer partitions (see Section I.3 below for a detailed discussion). As such examples
demonstrate, a specification that is iterative can be represented as a single term built on
$\mathcal{E}$, $\mathcal{Z}$ and the constructions +, ×, SEQ, CYC, MSET, PSET. An iterative specification
can be equivalently listed by naming some of the subterms (for instance, partitions in
terms of natural integers $\mathcal{I}$, themselves defined as sequences of atoms $\mathcal{Z}$).
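
Since an iterative specification is a single term, its counting sequence can be obtained by a small recursive evaluator. The sketch below is one possible encoding, not a system described in the text: terms are nested tuples, the tags `"E"`, `"Z"`, `"+"`, `"x"`, `"SEQ"`, `"MSET"` and the truncation order `N` are conventions chosen here, and $\mathrm{SEQ}_{\ge 1}(\mathcal{Z})$ is written as $\mathcal{Z} \times \mathrm{SEQ}(\mathcal{Z})$.

```python
N = 10  # truncation order: coefficients of z^0 .. z^N are computed

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def mul(a, b):
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(N + 1)]

def seq(b):
    """1/(1 - B(z)); requires b[0] == 0."""
    assert b[0] == 0
    a = [1] + [0] * N
    for n in range(1, N + 1):
        a[n] = sum(b[k] * a[n - k] for k in range(1, n + 1))
    return a

def mset(b):
    """prod_{n>=1} (1 - z^n)^(-b[n]); requires b[0] == 0."""
    assert b[0] == 0
    a = [1] + [0] * N
    for n in range(1, N + 1):
        for _ in range(b[n]):
            for m in range(n, N + 1):
                a[m] += a[m - n]
    return a

def ogf(term):
    """Counting sequence (truncated) of the class described by an iterative term."""
    if term == "E":
        return [1] + [0] * N
    if term == "Z":
        return [0, 1] + [0] * (N - 1)
    op, *args = term
    vals = [ogf(t) for t in args]
    if op == "+":
        return add(*vals)
    if op == "x":
        return mul(*vals)
    if op == "SEQ":
        return seq(vals[0])
    if op == "MSET":
        return mset(vals[0])
    raise ValueError(f"unknown construction: {op}")

binary_words = ("SEQ", ("+", "Z", "Z"))
positive_integers = ("x", "Z", ("SEQ", "Z"))      # SEQ_{>=1}(Z) as Z x SEQ(Z)
partitions = ("MSET", positive_integers)

print(ogf(binary_words))
print(ogf(partitions))
```

Recursive specifications cannot be handled this way, since the evaluator would not terminate on a term that refers to itself; they call for the iteration scheme discussed next.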


Semantics of recursion. We next turn our attention to recursive specifications,
starting with trees (cf. also Appendix A.9: Tree concepts, p. 737, for basic definitions).
In graph theory, a tree is classically defined as an undirected graph that is connected
and acyclic. Additionally, a tree is rooted if a particular vertex is specified (this vertex
is then known as the root). Computer scientists commonly make use of trees called
plane⁴ that are rooted but also embedded in the plane, so that the ordering of subtrees
attached to any node matters. Here, we will give the name of general plane trees to
such rooted plane trees and call $\mathcal{G}$ their class, where size is the number of vertices;
see, e.g., reference [538]. (The term “general” refers to the fact that all node degrees
are allowed.) For instance, a general tree of size 16, drawn with the root on top, is:

[Figure: a general plane tree with 16 nodes; diagram not reproduced.]

⁴ The alternative terminology “planar tree” is also often used, but it is frowned upon by some as
incorrect (all trees are planar graphs). We have thus opted for the expression “plane tree”, which parallels
the phrase “plane curve”.

As a consequence of the definition, if one interchanges, say, the second and third root
subtrees, then a different tree results: the original tree and its variant are not equivalent
under a smooth deformation of the plane. (General trees are thus comparable to
graphical renderings of genealogies where children are ordered by age.) Although we
have introduced plane trees as two-dimensional diagrams, it is obvious that any tree
also admits a linear representation: a tree $\tau$ with root $\zeta$ and root subtrees $\tau_1, \ldots, \tau_r$
(in that order) can be seen as the object $\zeta\,\boxed{\tau_1, \ldots, \tau_r}$ where the box encloses similar
representations of subtrees. Typographically, a box $\boxed{\,\cdot\,}$ may be reduced to a matching
pair of parentheses, “(·)”, and one gets in this way a linear description that illustrates
the correspondence between trees viewed as plane diagrams and functional terms of
mathematical logic and computer science.

Trees are best described recursively. A plane tree is a root to which is attached
a (possibly empty) sequence of trees. In other words, the class $\mathcal{G}$ of general trees is
definable by the recursive equation

(28)   $\mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G})$,

where $\mathcal{Z}$ comprises a single atom written “•” that represents a generic node.

Although such recursive definitions are familiar to computer scientists, the speci- 
fication (28) may look dangerously circular to some. One way of making good sense 
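of it is via an adaptation of the numerical technique of iteration. Start with $\mathcal{G}^{[0]} = \emptyset$,
the empty set, and define successively the classes

$\mathcal{G}^{[j+1]} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}^{[j]})$.

For instance, $\mathcal{G}^{[1]} = \mathcal{Z} \times \mathrm{SEQ}(\emptyset) = \{(\bullet, \epsilon)\} \cong \{\bullet\}$ describes the tree of size 1, and

$\mathcal{G}^{[2]} = \{\bullet,\ \bullet\boxed{\bullet},\ \bullet\boxed{\bullet\,\bullet},\ \bullet\boxed{\bullet\,\bullet\,\bullet},\ \ldots\}$
$\mathcal{G}^{[3]} = \{\bullet,\ \bullet\boxed{\bullet},\ \bullet\boxed{\bullet\,\bullet},\ \bullet\boxed{\bullet\,\bullet\,\bullet},\ \ldots,$
            $\bullet\boxed{\bullet\boxed{\bullet}},\ \bullet\boxed{\bullet\boxed{\bullet\,\bullet}},\ \bullet\boxed{\bullet\boxed{\bullet}\,\bullet},\ \bullet\boxed{\bullet\boxed{\bullet}\,\bullet\boxed{\bullet\,\bullet}},\ \ldots\}$.

First, each $\mathcal{G}^{[j]}$ is well defined since it corresponds to a purely iterative specification.
Next, we have the inclusion $\mathcal{G}^{[j]} \subset \mathcal{G}^{[j+1]}$ (a simple interpretation of $\mathcal{G}^{[j]}$ is the class
of all trees of height < j). We can therefore regard the complete class $\mathcal{G}$ as defined by
the limit of the $\mathcal{G}^{[j]}$; that is, $\mathcal{G} := \bigcup_j \mathcal{G}^{[j]}$.


▷ I.8. Lim-sup of classes. Let $\{\mathcal{A}^{[j]}\}$ be any increasing sequence of combinatorial classes, in
the sense that $\mathcal{A}^{[j]} \subset \mathcal{A}^{[j+1]}$ and the notions of size are compatible. If $\mathcal{A}^{[\infty]} = \bigcup_j \mathcal{A}^{[j]}$ is a
combinatorial class (there are finitely many elements of size n, for each n), then the corresponding
OGFs satisfy $A^{[\infty]}(z) = \lim_{j \to \infty} A^{[j]}(z)$ in the formal topology (Appendix A.5: Formal
power series, p. 730). ◁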
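
The iteration $\mathcal{G}^{[j+1]} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}^{[j]})$ can be followed at the level of truncated OGFs, which also illustrates the formal convergence of Note I.8. The sketch below assumes the sum, product, and sequence translations of Theorem I.1; the helper names `seq`, `next_class`, and `iterate` are introduced here for illustration only.

```python
def seq(b, n_max):
    """Coefficients of 1/(1 - B(z)) up to z^n_max; requires b[0] == 0."""
    a = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        a[n] = sum(b[k] * a[n - k] for k in range(1, n + 1))
    return a

def next_class(g, n_max):
    """OGF of Z x SEQ(G^[j]) computed from the OGF g of G^[j], truncated at z^n_max."""
    return [0] + seq(g, n_max)[:n_max]      # multiplying by z shifts coefficients

def iterate(j, n_max):
    """Truncated OGF of G^[j], starting from the empty class G^[0]."""
    g = [0] * (n_max + 1)
    for _ in range(j):
        g = next_class(g, n_max)
    return g

N = 8
for j in range(1, N + 1):
    print(j, iterate(j, N))
```

A tree with n nodes has height at most n - 1, so the coefficient of $z^n$ no longer changes once j ≥ n. For instance, `iterate(3, 8)` counts trees of height less than 3, while `iterate(8, 8)` already gives the exact numbers of general trees of size at most 8.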


Definition I.7. A specification for an $r$-tuple $\vec{\mathcal{A}} = (\mathcal{A}^{(1)}, \ldots, \mathcal{A}^{(r)})$ of classes is a
collection of $r$ equations,

(29)   $\mathcal{A}^{(1)} = \Phi_1(\mathcal{A}^{(1)}, \ldots, \mathcal{A}^{(r)})$
       $\mathcal{A}^{(2)} = \Phi_2(\mathcal{A}^{(1)}, \ldots, \mathcal{A}^{(r)})$
       $\cdots$
       $\mathcal{A}^{(r)} = \Phi_r(\mathcal{A}^{(1)}, \ldots, \mathcal{A}^{(r)})$

where each $\Phi_i$ denotes a term built from the $\mathcal{A}$ using the constructions of disjoint
union, cartesian product, sequence, powerset, multiset, and cycle, as well as the initial
classes $\mathcal{E}$ (neutral) and $\mathcal{Z}$ (atomic).


We also say that the system is a specification of $\mathcal{A}^{(1)}$. A specification for a
combinatorial class is thus a sort of formal grammar defining that class. Formally, the
system (29) is an iterative or non-recursive specification if it is strictly upper-triangular,
that is, $\mathcal{A}^{(r)}$ is defined solely in terms of initial classes $\mathcal{Z}$, $\mathcal{E}$; the definition of $\mathcal{A}^{(r-1)}$
only involves $\mathcal{A}^{(r)}$, and so on; in that case, by back substitutions, it is apparent that for
an iterative specification, $\mathcal{A}^{(1)}$ can be equivalently described by a single term involving
only the initial classes and the basic constructors. Otherwise, the system is said to
be recursive. In the latter case, the semantics of recursion is identical to the one
introduced in the case of trees: start with the “empty” vector of classes, $\vec{\mathcal{A}}^{[0]} := (\emptyset, \ldots, \emptyset)$,
iterate $\vec{\mathcal{A}}^{[j+1]} = \vec{\Phi}[\vec{\mathcal{A}}^{[j]}]$, and finally take the limit.

There is an alternative and convenient way to visualize these notions. Given a
specification of the form (29), we can associate its dependency (di)graph $\Gamma$ to it as
follows. The set of vertices of $\Gamma$ is the set of indices $\{1, \ldots, r\}$; for each equation
$\mathcal{A}^{(i)} = \Phi_i(\mathcal{A}^{(1)}, \ldots, \mathcal{A}^{(r)})$ and for each $j$ such that $\mathcal{A}^{(j)}$ appears explicitly on the
right-hand side of the equation, place a directed edge $(i \to j)$ in $\Gamma$. It is then easily
recognized that a class is iterative if the dependency graph of its specification is
acyclic; it is recursive if the dependency graph has a directed cycle. (This notion will
serve to define irreducible linear systems, p. 341, and irreducible polynomial systems,
p. 482, which enjoy strong asymptotic properties.)
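
The iterative/recursive distinction is thus a plain cycle-detection problem on the dependency graph. The following sketch uses a depth-first search; the encoding of a specification as a dictionary from class names to the class names occurring on the right-hand side is an assumption made here (initial classes $\mathcal{E}$ and $\mathcal{Z}$ are simply omitted), and `is_recursive` is a hypothetical helper name.

```python
def is_recursive(deps):
    """Return True iff the dependency digraph given by deps has a directed cycle.

    deps maps each class name to the list of class names that appear
    explicitly on the right-hand side of its defining equation.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    color = {v: WHITE for v in deps}

    def visit(v):
        color[v] = GREY
        for w in deps.get(v, ()):
            state = color.get(w, WHITE)
            if state == GREY:             # back edge: w is on the current path
                return True
            if state == WHITE and visit(w):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in deps)

general_trees = {"G": ["G"]}                      # G = Z x SEQ(G)
partitions = {"P": ["I"], "I": []}                # P = MSET(I), I = SEQ_{>=1}(Z)
mutual = {"A": ["B"], "B": ["C"], "C": ["A"]}     # made-up mutually recursive system

print(is_recursive(general_trees), is_recursive(partitions), is_recursive(mutual))
```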


Definition I.8. A class of combinatorial structures is said to be constructible or
specifiable iff it admits a (possibly recursive) specification in terms of sum, product,
sequence, set, multiset, and cycle constructions.


At this stage, we have therefore available a specification language for combina- 
torial structures which is some fragment of set theory with recursion added. Each 
constructible class has by virtue of Theorem I.1 an ordinary generating function for 
which functional equations can be produced systematically. (In fact, it is even possible 
to use computer algebra systems in order to compute it automatically! See the article 
by Flajolet, Salvy, and Zimmermann [255] for the description of such a system.) 


Theorem I.2 (Symbolic method, unlabelled universe). The generating function of a
constructible class is a component of a system of functional equations whose terms
are built from

$1,\ z,\ +,\ \times,\ Q,\ \mathrm{Exp},\ \overline{\mathrm{Exp}},\ \mathrm{Log}$,

where

$Q[f] = \dfrac{1}{1 - f}$,   $\mathrm{Log}[f] = \sum_{k=1}^{\infty} \frac{\varphi(k)}{k} \log \frac{1}{1 - f(z^k)}$,

$\mathrm{Exp}[f] = \exp\Big( \sum_{k=1}^{\infty} \frac{f(z^k)}{k} \Big)$,   $\overline{\mathrm{Exp}}[f] = \exp\Big( \sum_{k=1}^{\infty} (-1)^{k-1} \frac{f(z^k)}{k} \Big)$.


Pólya operators. The operator $Q$ translating sequences (SEQ) is classically known
as the quasi-inverse. The operator Exp (multisets, MSET) is called the Pólya
exponential⁵ and $\overline{\mathrm{Exp}}$ (powersets, PSET) is the modified Pólya exponential. The operator Log
is the Pólya logarithm. They are named after Pólya who first developed the general
enumerative theory of objects under permutation groups (pp. 85–86).


The statement of Theorem I.2 signifies that iterative classes have explicit gen- 
erating functions involving compositions of the basic operators only, while recursive 
structures have OGFs that are accessible indirectly via systems of functional equa- 
tions. As we shall see at various places in this chapter, the following classes are con- 
structible: binary words, binary trees, general trees, integer partitions, integer com- 
positions, non-plane trees, polynomials over finite fields, necklaces, and wheels. We 
conclude this section with a few simple illustrations of the symbolic method expressed 
by Theorem I.2. 


Binary words. The OGF of binary words, as seen already, can be obtained directly
from the iterative specification,

$\mathcal{W} = \mathrm{SEQ}(\mathcal{Z} + \mathcal{Z}) \implies W(z) = \dfrac{1}{1 - 2z}$,

whence the expected result, $W_n = 2^n$. (Note: in our framework, if a, b are letters,
then $\mathcal{Z} + \mathcal{Z} \cong \{a, b\}$.)

General trees. The recursive specification of general trees leads to an implicit
definition of their OGF,

$\mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}) \implies G(z) = \dfrac{z}{1 - G(z)}$.

From this point on, basic algebra⁶ does the rest. First the original equation is equivalent
(in the ring of formal power series) to $G - G^2 - z = 0$. Next, the quadratic equation
is solvable by radicals, and one finds

$G(z) = \frac{1}{2} \big( 1 - \sqrt{1 - 4z} \big)$
     $= z + z^2 + 2z^3 + 5z^4 + 14z^5 + 42z^6 + 132z^7 + 429z^8 + \cdots$
     $= \sum_{n \ge 1} \frac{1}{n} \binom{2n-2}{n-1} z^n$.

⁵ It is a notable fact that, although the Pólya operators look algebraically “difficult” to compute with,
their treatment by complex asymptotic methods, as regards coefficient asymptotics, is comparatively “easy”.
We shall see many examples in Chapters IV–VII (e.g., pp. 252, 475).

⁶ Methodological note: for simplicity, our computation is developed using the usual language of
mathematics. However, analysis is not needed in this derivation, and operations such as solving quadratic
equations and expanding fractional powers can all be cast within the purely algebraic framework of formal power
series (p. 730).

(The conjugate root $\frac{1}{2}(1 + \sqrt{1 - 4z}) = 1 - z - z^2 - \cdots$ is to be discarded since it has constant
term 1, whereas $G_0 = 0$, and since it involves negative coefficients.) The expansion then results
from Newton’s binomial expansion,

$(1 + x)^{\alpha} = 1 + \alpha x + \dfrac{\alpha(\alpha - 1)}{2!} x^2 + \cdots$,

applied with $\alpha = \frac{1}{2}$ and $x = -4z$.

The numbers

(30)   $C_n = \dfrac{1}{n+1} \binom{2n}{n} = \dfrac{(2n)!}{(n+1)!\, n!}$  with OGF  $C(z) = \dfrac{1 - \sqrt{1 - 4z}}{2z}$

are known as the Catalan numbers (EIS A000108) in the honour of Eugène Catalan,
the mathematician who first studied their properties in great depth (pp. 6 and 20). In
summary, general trees are enumerated by Catalan numbers:

$G_n = C_{n-1} \equiv \dfrac{1}{n} \binom{2n-2}{n-1}$.

For this reason the term Catalan tree is often employed as synonymous to “general
(rooted unlabelled plane) tree”.
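
The closed form can be cross-checked against the functional equation itself. Reading off coefficients in $G = z + G^2$ gives $G_1 = 1$ and $G_n = \sum_{k=1}^{n-1} G_k G_{n-k}$ for $n \ge 2$; the sketch below compares this recurrence with $\frac{1}{n}\binom{2n-2}{n-1}$. The function names are assumptions made for this illustration.

```python
from math import comb

def general_trees(n_max):
    """Coefficients G_0..G_n_max obtained from G = z + G^2."""
    g = [0] * (n_max + 1)
    if n_max >= 1:
        g[1] = 1
    for n in range(2, n_max + 1):
        g[n] = sum(g[k] * g[n - k] for k in range(1, n))
    return g

def catalan(n):
    """Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

counts = general_trees(8)
print(counts)
print([comb(2 * n - 2, n - 1) // n for n in range(1, 9)])
```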


Triangulations. Fix n + 2 points arranged in anticlockwise order on a circle and 
conventionally numbered from 0 to n + 1 (for instance the (n + 2)th roots of unity). 
A triangulation is defined as a (maximal) decomposition of the convex (n + 2)-gon 
defined by the points into n triangles (Figure I.1, p. 17). Triangulations are taken here 
as abstract topological configurations defined up to continuous deformations of the 
plane. The size of the triangulation is the number of triangles; that is, n. Given a 
triangulation, we define its “root” as a triangle chosen in some conventional and un- 
ambiguous manner (e.g., at the start, the triangle that contains the two smallest labels). 
Then, a triangulation decomposes into its root triangle and two subtriangulations (that 
may well be “empty”) appearing on the left and right sides of the root triangle; the 
decomposition is illustrated by the following diagram: 


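[Diagram: a triangulation decomposed into its root triangle and its left and right subtriangulations; not reproduced.]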


The class $\mathcal{T}$ of all triangulations can be specified recursively as

$\mathcal{T} = \{\epsilon\} + (\mathcal{T} \times \nabla \times \mathcal{T})$,




provided that we agree to consider a 2-gon (a segment) as giving rise to an “empty” 
triangulation of size 0. (The subtriangulations are topologically and combinatorially 
equivalent to standard ones, with vertices regularly spaced on a circle.) Consequently, 
the OGF $T(z)$ satisfies the equation

$$T(z) = 1 + z\,T(z)^2, \qquad\text{so that}\qquad T(z) = \frac{1}{2z}\left(1 - \sqrt{1-4z}\right). \tag{31}$$

As a result of (30) and (31), triangulations are enumerated by Catalan numbers:

$$T_n = C_n = \frac{1}{n+1}\binom{2n}{n}.$$


This particular result goes back to Euler and Segner, a century before Catalan; see 
Figure I.1 on p. 17 for first values and p. 73 below for related bijections. 
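
As a small check of (31), the equation $T(z) = 1 + zT(z)^2$ can be read coefficient by coefficient as the convolution recurrence $T_n = \sum_{i+j=n-1} T_i T_j$ with $T_0 = 1$. The sketch below (function names are ours, chosen for illustration only) compares the recurrence with the closed form.

```python
from math import comb

def catalan_by_recurrence(n_max):
    """Coefficients T_0..T_{n_max} of the series defined by T(z) = 1 + z*T(z)^2."""
    t = [1]
    for n in range(1, n_max + 1):
        # [z^n] z*T(z)^2 is the convolution sum over i + j = n - 1
        t.append(sum(t[i] * t[n - 1 - i] for i in range(n)))
    return t

def catalan_closed_form(n):
    """C_n = binom(2n, n) / (n + 1); the division is exact."""
    return comb(2 * n, n) // (n + 1)

coeffs = catalan_by_recurrence(10)
assert coeffs == [catalan_closed_form(n) for n in range(11)]
print(coeffs)
```

The recurrence costs a quadratic number of multiplications, which is enough for experimentation; the closed form is of course preferable for large $n$.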


▷ I.9. A bijection. Since both general trees and triangulations are enumerated by Catalan
numbers, there must exist a size-preserving bijection between the two classes. Find one such
bijection. [Hint: the construction of triangulations is evocative of binary trees, while binary
trees are themselves in bijective correspondence with general trees (p. 73).] ◁


▷ I.10. A variant specification of triangulations. Consider the class $\mathcal{U}$ of “non-empty” triangulations
of the $n$-gon, that is, we exclude the 2-gon and the corresponding “empty” triangulation
of size 0. Then $\mathcal{U} = \mathcal{T} \setminus \{\epsilon\}$ admits the specification

$$\mathcal{U} = \nabla + (\nabla \times \mathcal{U}) + (\mathcal{U} \times \nabla) + (\mathcal{U} \times \nabla \times \mathcal{U}),$$

which also leads to the Catalan numbers via $U = z(1+U)^2$, so that
$U(z) = \left(1 - 2z - \sqrt{1-4z}\right)/(2z) \equiv T(z) - 1$. ◁


I.2.4. Exploiting generating functions and counting sequences. In this book
we are going to see altogether more than a hundred applications of the symbolic 
method. Before engaging in technical developments, it is worth inserting a few com- 
ments on the way generating functions and counting sequences can be put to good use 
in order to solve combinatorial problems. 


Explicit enumeration formulae. In a number of situations, generating functions 
are explicit and can be expanded in such a way that explicit formulae result for their 
coefficients. A prime example is the counting of general trees and of triangulations 
above, where the quadratic equation satisfied by an OGF is amenable to an explicit 
solution—the resulting OGF could then be expanded by means of Newton’s binomial 
theorem. Similarly, we derive later in this chapter an explicit form for the number 
of integer compositions by means of the symbolic method (the answer turns out to 
be simply $2^{n-1}$) and obtain in this way, through OGFs, many related enumeration
results. In this book, we assume as known the elementary techniques from basic 
calculus by which the Taylor expansion of an explicitly given function can be obtained. 
(Elementary references on such aspects are Wilf’s Generatingfunctionology [608], 
Graham, Knuth, and Patashnik’s Concrete Mathematics [307], and our book [538].) 


Implicit enumeration formulae. In a number of cases, the generating functions 
obtained by the symbolic method are still in a sense explicit, but their form is such that 
their coefficients are not clearly reducible to a closed form. It is then still possible to 
obtain initial values of the corresponding counting sequence by means of a symbolic 
manipulation system. Furthermore, from generating functions, it is possible systematically
to derive recurrences that lead to a procedure for computing an arbitrary number
of terms of the counting sequence in a reasonably efficient manner. A typical example
of this situation is the OGF of integer partitions,

$$\prod_{m=1}^{\infty} \frac{1}{1-z^m},$$

for which recurrences obtained from the OGF and associated to fast algorithms are
given in Note I.13 (p. 42) and Note I.19 (p. 49). An even more spectacular example
is the OGF of non-plane trees, which is proved below (p. 71) to satisfy the infinite
functional equation

$$H(z) = z\exp\left(H(z) + \frac{1}{2}H(z^2) + \frac{1}{3}H(z^3) + \cdots\right),$$

and for which coefficients are computable in low complexity: see Note I.43, p. 72.
(The references [255, 264, 456] develop a systematic approach to such problems.) 
The corresponding asymptotic analysis constitutes the main theme of Section VII.5, 
p. 475. 


Asymptotic formulae. Such forms are our eventual goal as they allow for an easy 
interpretation and comparison of counting sequences. From a quick glance at the 
table of initial values of $W_n$ (words), $P_n$ (permutations), $T_n$ (triangulations), as given
in (2), p. 18, it is apparent that $W_n$ grows more slowly than $T_n$, which itself grows more
slowly than $P_n$. The classification of growth rates of counting sequences belongs properly
to the asymptotic theory of combinatorial structures which neatly relates to the
symbolic method via complex analysis. A thorough treatment of this part of the theory
is presented in Chapters IV–VIII. Given the methods expounded there, it becomes
possible to estimate asymptotically the coefficients of virtually any generating func- 
tion, however complicated, that is provided by the symbolic method; that is, implicit 
enumerations in the sense above are well covered by complex asymptotic methods. 

Here, we content ourselves with a few remarks based on elementary real analysis. 
(The basic notations are described in Appendix A.2: Asymptotic notation, p. 722.) 
The sequence $W_n = 2^n$ grows exponentially and, in such an extremely simple case, the
exact form coincides with the asymptotic form. The sequence $P_n = n!$ must grow
faster. But how fast? The answer is provided by Stirling's formula, an important
approximation originally due to James Stirling (Invitation, p. 4):

$$n! = \left(\frac{n}{e}\right)^{n} \sqrt{2\pi n}\,\left(1 + O\!\left(\frac{1}{n}\right)\right) \qquad (n \to +\infty). \tag{32}$$

(Several proofs are given in this book, based on the method of Laplace, p. 760, Mellin
transforms, p. 766, singularity analysis, p. 407, and the saddle-point method, p. 555.)
The ratios of the exact values to Stirling's approximations

    n                               1         2         5         10        100       1000
    n!/(n^n e^{-n} \sqrt{2\pi n})   1.084437  1.042207  1.016783  1.008365  1.000833  1.000083

show an excellent quality of the asymptotic estimate: the error is only 8% for n = 1,
less than 1% for n = 10, and less than 1 per thousand for any n greater than 100.
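
The ratios above are easy to recompute. The following sketch (our own illustration, working with logarithms so that large $n$ does not overflow floating-point arithmetic) evaluates $n!/(n^n e^{-n}\sqrt{2\pi n})$ for the same values of $n$.

```python
import math

def stirling_ratio(n):
    """n! divided by n^n * e^(-n) * sqrt(2*pi*n), computed through logarithms."""
    log_exact = math.lgamma(n + 1)  # log(n!)
    log_approx = n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)
    return math.exp(log_exact - log_approx)

for n in (1, 2, 5, 10, 100, 1000):
    print(n, f"{stirling_ratio(n):.6f}")
```

The use of `math.lgamma` is a convenience; for exact work one would compare integers and high-precision rationals instead.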

Stirling’s formula provides in turn the asymptotic form of the Catalan numbers, 
by means of a simple calculation: 


$$C_n = \frac{1}{n+1}\,\frac{(2n)!}{(n!)^2} \sim \frac{1}{n}\,\frac{(2n)^{2n} e^{-2n} \sqrt{4\pi n}}{n^{2n} e^{-2n}\, 2\pi n},$$

which simplifies to

$$C_n \sim \frac{4^n}{\sqrt{\pi n^3}}. \tag{33}$$

Thus, the growth of Catalan numbers is roughly comparable to an exponential, $4^n$,
modulated by a subexponential factor, here $1/\sqrt{\pi n^3}$. A surprising consequence of
this asymptotic estimate in the area of boolean function complexity appears in
Example I.17 below (p. 77).

Altogether, the asymptotic number of general trees and triangulations is well sum- 
marized by a simple formula. Approximations become more and more accurate as n 
becomes large. Figure I.5 illustrates the different growth regimes of our three ref- 
erence sequences while Figure I.6 exemplifies the quality of the approximation with 
subtler phenomena also apparent on the figures and well explained by asymptotic the- 
ory. Such asymptotic formulae then make comparison between the growth rates of 
sequences easy. 

The interplay between combinatorial structure and asymptotic structure is indeed 
the principal theme of this book. We shall see in Part B that the generating func- 
tions provided by the symbolic method typically admit similarly simple asymptotic 
coefficient estimates. 
▷ I.11. The complexity of coding. A company specializing in computer-aided design has sold
to you a scheme that (they claim) can encode any triangulation of size $n \ge 100$ using at most
$1.5n$ bits of storage. After reading these pages, what do you do? [Hint: sue them!] See also
Note I.24 (p. 53) for related coding arguments. ◁
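
The counting argument behind the hint can be made concrete. A code that distinguishes all triangulations of size $n$ with words of a fixed length $b$ needs $2^b \ge C_n$. The sketch below, which assumes a fixed-length binary code and uses names of our own choosing, computes the smallest such $b$ exactly.

```python
from math import comb

def catalan(n):
    """C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def min_bits(n):
    """Smallest b with 2**b >= C_n, i.e. ceil(log2(C_n)), in exact integer arithmetic."""
    return (catalan(n) - 1).bit_length()

n = 100
print(min_bits(n), "bits needed, versus the claimed", 1.5 * n)
```

Since $C_n$ grows like $4^n$ up to a subexponential factor, the requirement is close to $2n$ bits, in line with estimate (33).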
    n         C_n                  C_n^*                C_n^*/C_n

    1         1                    2.25                 2.25675 83341 91025 14779 23178
    10        16796                18707.89             1.11383 05127 52445 89437 89064
    100       0.89651·10^57        0.90661·10^57        1.01126 32841 24540 52257 13957
    1000      0.20461·10^598       0.20484·10^598       1.00112 51328 15424 16470 12827
    10000     0.22453·10^6015      0.22456·10^6015      1.00011 25013 28127 92913 51406
    100000    0.17805·10^60199     0.17805·10^60199     1.00001 12500 13281 25292 96322
    1000000   0.55303·10^602051    0.55303·10^602051    1.00000 11250 00132 81250 29296

Figure I.6. The Catalan numbers $C_n$, their Stirling approximation $C_n^* = 4^n/\sqrt{\pi n^3}$,
and the ratio $C_n^*/C_n$.


▷ I.12. Experimental asymptotics. From the data of Figure I.6, guess the values⁷ of $C^*_{10^7}/C_{10^7}$
and of $C^*_{5\cdot 10^6}/C_{5\cdot 10^6}$ to 25D. (See Figure VI.3, p. 384, as well as, e.g., [385] for related
asymptotic expansions and [80] for similar properties.) ◁


I.3. Integer compositions and partitions 


This section and the next few provide examples of counting via specifications in 
classical areas of combinatorial theory. They illustrate the benefits of the symbolic 
method: generating functions are obtained with hardly any computation, and at the 
same time, many counting refinements follow from a basic combinatorial construc- 
tion. The most direct applications described here relate to the additive decomposition 
of integers into summands with the classical combinatorial—arithmetic structures of 
partitions and compositions. The specifications are iterative and simply combine two 
levels of constructions of type SEQ, MSET, CYC, PSET.


I.3.1. Compositions and partitions. Our first examples have to do with decom- 
posing integers into sums. 
Definition I.9. A composition of an integer $n$ is a sequence $(x_1, x_2, \ldots, x_k)$ of integers
(for some $k$) such that

$$n = x_1 + x_2 + \cdots + x_k, \qquad x_j \ge 1.$$

A partition of an integer $n$ is a sequence $(x_1, x_2, \ldots, x_k)$ of integers (for some $k$) such
that

$$n = x_1 + x_2 + \cdots + x_k \qquad\text{and}\qquad x_1 \ge x_2 \ge \cdots \ge x_k \ge 1.$$

In both cases, the $x_i$ are called the summands or the parts and the quantity $n$ is called
the size.
By representing summands in unary using small discs (“•”), we can render graphically
a composition by drawing bars between some of the balls; if we arrange summands
vertically, compositions appear as ragged landscapes. In contrast, partitions
appear as staircases, also known as Ferrers diagrams [129, p. 100]; see Figure I.7. We
let $\mathcal{C}$ and $\mathcal{P}$ denote the class of all compositions and all partitions, respectively. Since
a set can always be presented in sorted order, the difference between compositions and
partitions lies in the fact that the order of summands does or does not matter. This is
reflected by the use of a sequence construction (for $\mathcal{C}$) against a multiset construction
(for $\mathcal{P}$). From this perspective, it proves convenient to regard 0 as obtained by the
empty sequence of summands ($k = 0$), and we shall do so from now on.

Figure I.7. Graphical representations of compositions and partitions: (left) the composition
1 + 3 + 1 + 4 + 2 + 3 = 14 with its “ragged landscape” and “balls-and-bars”
models; (right) the partition 8 + 8 + 6 + 5 + 4 + 4 + 4 + 2 + 1 + 1 = 43 with its
staircase (Ferrers diagram) model.

⁷In this book, we abbreviate a phrase such as “25 decimal places” by “25D”.


Integers, as a combinatorial class. Let $\mathcal{I} = \{1, 2, \ldots\}$ denote the combinatorial
class of all integers at least 1 (the summands), and let the size of each integer be its
value. Then, the OGF of $\mathcal{I}$ is

$$I(z) = \sum_{n\ge 1} z^n = \frac{z}{1-z}, \tag{34}$$

since $I_n = 1$ for $n \ge 1$, corresponding to the fact that there is exactly one object in $\mathcal{I}$
for each size $n \ge 1$. If integers are represented in unary, say by small balls, one has

$$\mathcal{I} = \{1, 2, 3, \ldots\} \cong \{\bullet, \bullet\bullet, \bullet\bullet\bullet, \ldots\} = \mathrm{SEQ}_{\ge 1}\{\bullet\}, \tag{35}$$

which constitutes a direct way to visualize the equality $I(z) = z/(1-z)$.


Compositions. First, the specification of compositions as sequences admits, by
Theorem I.1, a direct translation into OGF:

$$\mathcal{C} = \mathrm{SEQ}(\mathcal{I}) \quad\Longrightarrow\quad C(z) = \frac{1}{1 - I(z)}. \tag{36}$$

The collection of equations (34), (36) thus fully determines $C(z)$:

$$C(z) = \frac{1}{1 - \frac{z}{1-z}} = \frac{1-z}{1-2z} = 1 + z + 2z^2 + 4z^3 + 8z^4 + 16z^5 + 32z^6 + \cdots.$$

From here, the counting problem for compositions is solved by a straightforward
expansion of the OGF: one has

$$C(z) = \left(\sum_{n\ge 0} 2^n z^n\right) - \left(\sum_{n\ge 0} 2^n z^{n+1}\right),$$

implying $C_0 = 1$ and $C_n = 2^n - 2^{n-1}$ for $n \ge 1$; that is,

$$C_n = 2^{n-1}, \qquad n \ge 1. \tag{37}$$

This agrees with basic combinatorics since a composition of $n$ can be viewed as the
placement of separation bars at a subset of the $n - 1$ existing places in between $n$
aligned balls (the “balls-and-bars” model of Figure I.7), of which there are clearly
$2^{n-1}$ possibilities.

    0    1                                                                               1
    10   1024                                                                            42
    20   1048576                                                                         627
    30   1073741824                                                                      5604
    40   1099511627776                                                                   37338
    50   1125899906842624                                                                204226
    60   1152921504606846976                                                             966467
    70   1180591620717411303424                                                          4087968
    80   1208925819614629174706176                                                       15796476
    90   1237940039285380274899124224                                                    56634173
    100  1267650600228229401496703205376                                                 190569292
    110  1298074214633706907132624082305024                                              607163746
    120  1329227995784915872903807060280344576                                           1844349560
    130  1361129467683753853853498429727072845824                                        5371315400
    140  1393796574908163946345982392040522594123776                                     15065878135
    150  1427247692705959881058285969449495136382746624                                  40853235313
    160  1461501637330902918203684832716283019655932542976                               107438159466
    170  1496577676626844588240573268701473812127674924007424                            274768617130
    180  1532495540865888858358347027150309183618739122183602176                         684957390936
    190  1569275433846670190958947355801916604025588861116008628224                      1667727404093
    200  1606938044258990275541962092341162602522202993782792835301376                   3972999029388
    210  1645504557321206042154969182557350504982735865633579863348609024                9275102575355
    220  1684996666696914987166688442938726917102321526408785780068975640576             21248279009367
    230  1725436586697640946858688965569256363112777243042596638790631055949824          47826239745920
    240  1766847064778384329583297500742918515827483896875618958121606201292619776       105882246722733
    250  1809251394333065553493296640760748560207343510400633813116524750123642650624    230793554364681

Figure I.8. For $n = 0, 10, 20, \ldots, 250$ (left), the number of compositions $C_n$ (middle)
and the number of partitions $P_n$ (right). The figure illustrates the difference in
growth between $C_n = 2^{n-1}$ and $P_n = e^{O(\sqrt{n})}$.

Partitions. For partitions specified as multisets, the general translation mechan- 
ism of Theorem I.1, p. 27, provides 


$$\mathcal{P} = \mathrm{MSET}(\mathcal{I}) \quad\Longrightarrow\quad P(z) = \exp\left(I(z) + \frac{1}{2}I(z^2) + \frac{1}{3}I(z^3) + \cdots\right), \tag{38}$$

together with the product form corresponding to (25), p. 29,

$$P(z) = \prod_{m=1}^{\infty} \frac{1}{1-z^m} = \left(1 + z + z^2 + \cdots\right)\left(1 + z^2 + z^4 + \cdots\right)\left(1 + z^3 + z^6 + \cdots\right)\cdots = 1 + z + 2z^2 + 3z^3 + 5z^4 + 7z^5 + 11z^6 + 15z^7 + 22z^8 + \cdots \tag{39}$$

(the counting sequence is EIS A000041). Contrary to compositions that are counted
by the explicit formula $2^{n-1}$, no simple form exists for $P_n$. Asymptotic analysis of
the OGF (38) based on the saddle-point method (Chapter VIII, p. 574) shows that
$P_n = e^{O(\sqrt{n})}$. In fact an extremely famous theorem of Hardy and Ramanujan later
improved by Rademacher (see Andrews' book [14] and Chapter VII) provides a full
expansion of which the asymptotically dominant term is

$$P_n \sim \frac{1}{4n\sqrt{3}} \exp\left(\pi\sqrt{\frac{2n}{3}}\right). \tag{40}$$

There are consequently appreciably fewer partitions than compositions (Figure I.8).
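
The product form (39) translates directly into a computation: multiplying a truncated series by $1/(1-z^m)$ amounts to a running sum with stride $m$. The sketch below (an illustration with names of our own choosing) computes the $P_n$ this way and compares them with the dominant term (40).

```python
import math

def partition_numbers(n_max):
    """P_0..P_{n_max}, by expanding prod_{m>=1} 1/(1 - z^m) truncated at z^{n_max}."""
    p = [1] + [0] * n_max
    for m in range(1, n_max + 1):
        # multiplying by 1/(1 - z^m): p[n] += p[n - m], for increasing n
        for n in range(m, n_max + 1):
            p[n] += p[n - m]
    return p

def hardy_ramanujan(n):
    """Dominant term (40): exp(pi * sqrt(2n/3)) / (4 * n * sqrt(3))."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))

p = partition_numbers(250)
for n in (10, 100, 250):
    print(n, p[n], round(hardy_ramanujan(n) / p[n], 4))
```

The printed ratios approach 1 only slowly, which is consistent with (40) being the first term of a full expansion.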
▷ I.13. A recurrence for the partition numbers. Logarithmic differentiation gives

$$z\,\frac{P'(z)}{P(z)} = \sum_{n=1}^{\infty} \frac{n z^n}{1-z^n}, \qquad\text{implying}\qquad nP_n = \sum_{j=1}^{n} \sigma(j)\,P_{n-j},$$

where $\sigma(n)$ is the sum of the divisors of $n$ (e.g., $\sigma(6) = 1 + 2 + 3 + 6 = 12$). Consequently,
$P_1, \ldots, P_N$ can be computed in $O(N^2)$ integer-arithmetic operations. (The technique
is generally applicable to powersets and multisets; see Note I.43 (p. 72) for another application.
Note I.19 (p. 49) further lowers the bound to $O(N\sqrt{N})$, in the case of partitions.) ◁
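
A minimal sketch of the recurrence of Note I.13 follows; the divisor sums are tabulated by a sieve, and the function names are ours. The inner sum has $n$ terms, which gives the quadratic operation count stated in the note.

```python
def sum_of_divisors_table(n_max):
    """sigma[j] = sum of the divisors of j for 1 <= j <= n_max (sigma[0] is unused)."""
    sigma = [0] * (n_max + 1)
    for d in range(1, n_max + 1):
        for multiple in range(d, n_max + 1, d):
            sigma[multiple] += d
    return sigma

def partitions_via_sigma(n_max):
    """P_0..P_{n_max} from n * P_n = sum_{j=1}^{n} sigma(j) * P_{n-j}."""
    sigma = sum_of_divisors_table(n_max)
    p = [1]
    for n in range(1, n_max + 1):
        total = sum(sigma[j] * p[n - j] for j in range(1, n + 1))
        p.append(total // n)  # the division is exact by the identity above
    return p

print(partitions_via_sigma(10))
```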

By varying (36) and (38), we can use the symbolic method to derive a number of 
counting results in a straightforward manner. First, we state the following proposition. 
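Proposition I.1. Let $\mathcal{T} \subseteq \mathcal{I}$ be a subset of the positive integers. The OGFs of the
classes $\mathcal{C}^{\mathcal{T}} := \mathrm{SEQ}(\mathrm{SEQ}_{\mathcal{T}}(\mathcal{Z}))$ and $\mathcal{P}^{\mathcal{T}} := \mathrm{MSET}(\mathrm{SEQ}_{\mathcal{T}}(\mathcal{Z}))$ of compositions and
partitions having summands restricted to $\mathcal{T} \subset \mathbb{Z}_{\ge 1}$ are given by

$$C^{\mathcal{T}}(z) = \frac{1}{1 - \sum_{n\in\mathcal{T}} z^n} = \frac{1}{1 - T(z)}, \qquad P^{\mathcal{T}}(z) = \prod_{n\in\mathcal{T}} \frac{1}{1 - z^n}.$$

Proof. A direct consequence of the specifications and Theorem I.1, p. 27. ∎

This proposition permits us to enumerate compositions and partitions with restricted
summands, as well as with a fixed number of parts.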
Example I.4. Compositions with restricted summands. In order to enumerate the class $\mathcal{C}^{\{1,2\}}$
of compositions of $n$ whose parts are only allowed to be taken from the set $\{1, 2\}$, simply write

$$\mathcal{C}^{\{1,2\}} = \mathrm{SEQ}(\mathcal{I}^{\{1,2\}}) \qquad\text{with}\qquad \mathcal{I}^{\{1,2\}} = \{1, 2\}.$$

Thus, in terms of generating functions, one has

$$C^{\{1,2\}}(z) = \frac{1}{1 - I^{\{1,2\}}(z)} \qquad\text{with}\qquad I^{\{1,2\}}(z) = z + z^2.$$

This formula implies

$$C^{\{1,2\}}(z) = \frac{1}{1 - z - z^2} = 1 + z + 2z^2 + 3z^3 + 5z^4 + 8z^5 + 13z^6 + \cdots,$$

and the number of compositions of $n$ in this class is expressed by a Fibonacci number,

$$C_n^{\{1,2\}} = F_{n+1} \qquad\text{where}\qquad F_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^{n} - \left(\frac{1-\sqrt{5}}{2}\right)^{n}\right],$$

of daisy–artichoke–rabbit fame. In particular, the rate of growth is of the exponential type $\varphi^n$,
where $\varphi := \frac{1+\sqrt{5}}{2}$ is the golden ratio.
Similarly, compositions all of whose summands lie in the set $\{1, 2, \ldots, r\}$ have generating
function

$$C^{\{1,\ldots,r\}}(z) = \frac{1}{1 - z - z^2 - \cdots - z^r} = \frac{1}{1 - z\,\frac{1-z^r}{1-z}} = \frac{1-z}{1 - 2z + z^{r+1}}, \tag{41}$$

and the corresponding counts are generalized Fibonacci numbers. A double combinatorial sum
expresses these counts:

$$C_n^{\{1,\ldots,r\}} = [z^n] \sum_{j} \left(\frac{z(1-z^r)}{1-z}\right)^{j} = \sum_{j,k} (-1)^k \binom{j}{k}\binom{n - rk - 1}{j-1}. \tag{42}$$

This result is perhaps not too useful for grasping the rate of growth of the sequence when $n$ gets
large, so that asymptotic analysis is called for. Asymptotically, for any fixed $r \ge 2$, there is a
unique root $\rho_r$ of the denominator $1 - 2z + z^{r+1}$ in $(\frac{1}{2}, 1)$; this root dominates all the other
roots and is simple. Methods amply developed in Chapter IV and Example V.4 (p. 308) imply
that, for some constant $c_r > 0$,

$$C_n^{\{1,\ldots,r\}} \sim c_r\,\rho_r^{-n} \qquad\text{for fixed } r \text{ as } n \to \infty. \tag{43}$$

The quantity $1/\rho_r$ plays a rôle similar to that of the golden ratio when $r = 2$. ∎
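
Formula (41) is equivalent to the linear recurrence $C_n = C_{n-1} + \cdots + C_{n-r}$, which gives a convenient check of both the Fibonacci case and the double sum (42). In the sketch below (names are ours), binomial coefficients whose upper index is smaller than the lower one are taken to vanish, which is our reading of the summation range in (42).

```python
from math import comb

def bounded_compositions(n_max, r):
    """C_0..C_{n_max} for summands in {1..r}, from C(z) = 1/(1 - z - ... - z^r)."""
    c = [1]
    for n in range(1, n_max + 1):
        c.append(sum(c[n - s] for s in range(1, min(r, n) + 1)))
    return c

def double_sum(n, r):
    """Right-hand side of (42) for n >= 1."""
    total = 0
    for j in range(1, n + 1):
        for k in range(0, j + 1):
            top = n - r * k - 1
            if top >= j - 1:  # otherwise the binomial coefficient is zero
                total += (-1) ** k * comb(j, k) * comb(top, j - 1)
    return total

print(bounded_compositions(10, 2))  # Fibonacci numbers F_{n+1}
print([double_sum(n, 3) for n in range(1, 11)])
```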


▷ I.14. Compositions into primes. The additive decomposition of integers into primes is still
surrounded with mystery. For instance, it is not known whether every even number is the sum
of two primes (Goldbach's conjecture). However, the number of compositions of $n$ into prime
summands (any number of summands is permitted) is $B_n = [z^n]B(z)$ where

$$B(z) = \left(1 - \sum_{p\ \mathrm{prime}} z^p\right)^{-1} = \left(1 - z^2 - z^3 - z^5 - z^7 - z^{11} - \cdots\right)^{-1} = 1 + z^2 + z^3 + z^4 + 3z^5 + 2z^6 + 6z^7 + 6z^8 + 10z^9 + 16z^{10} + \cdots$$

(EIS A023360), and complex asymptotic methods make it easy to determine the asymptotic
form $B_n \sim 0.30365 \cdot 1.47622^n$; see Example V.2, p. 297. ◁
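
The series of Note I.14 can be expanded by the recurrence $B_n = \sum_{p \le n} B_{n-p}$ (sum over primes $p$), with $B_0 = 1$. The sketch below is an illustration only; the primality test by trial division is a simplification adequate for small $n$.

```python
def is_prime(m):
    """Trial-division primality test, sufficient for small arguments."""
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def prime_compositions(n_max):
    """B_0..B_{n_max}, where B(z) = 1 / (1 - sum over primes p of z^p)."""
    primes = [p for p in range(2, n_max + 1) if is_prime(p)]
    b = [1]
    for n in range(1, n_max + 1):
        b.append(sum(b[n - p] for p in primes if p <= n))
    return b

b = prime_compositions(60)
print(b[:11])
# Ratio of B_60 to the asymptotic form quoted in the note
print(b[60] / (0.30365 * 1.47622 ** 60))
```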


Example I.5. Partitions with restricted summands (denumerants). Whenever summands are
restricted to a finite set, the special partitions that result are called denumerants. A denumerant
problem popularized by Pólya [493, §3] consists in finding the number of ways of giving change
of 99 cents using coins that are pennies (1 cent), nickels (5 cents), dimes (10 cents) and quarters
(25 cents). (The order in which the coins are taken does not matter and repetitions are allowed.)
For the case of a finite $\mathcal{T}$, we predict from Proposition I.1 that $P^{\mathcal{T}}(z)$ is always a rational
function with poles that are at roots of unity; also the $P_n^{\mathcal{T}}$ satisfy a linear recurrence related to
the structure of $\mathcal{T}$. The solution to the original coin change problem is found to be

$$[z^{99}]\,\frac{1}{(1-z)(1-z^5)(1-z^{10})(1-z^{25})} = 213.$$

In the same vein, one proves that

$$P_n^{\{1,2\}} = \left\lceil \frac{2n+3}{4} \right\rfloor, \qquad P_n^{\{1,2,3\}} = \left\lceil \frac{(n+3)^2}{12} \right\rfloor;$$

here $\lceil x \rfloor \equiv \lfloor x + \frac{1}{2} \rfloor$ denotes the integer closest to the real number $x$. Such results are typically
obtained by the two-step process: (i) decompose the rational generating function into simple
fractions; (ii) compute the coefficients of each simple fraction and combine them to get the
final result [129, p. 108].
The general argument also gives the generating function of partitions whose summands lie
in the set $\{1, 2, \ldots, r\}$ as

$$P^{\{1,\ldots,r\}}(z) = \prod_{m=1}^{r} \frac{1}{1 - z^m}. \tag{44}$$

In other words, we are enumerating partitions according to the value of the largest summand.
One then finds by looking at the poles (Theorem IV.9, p. 256):

$$P_n^{\{1,\ldots,r\}} \sim c_r\,n^{r-1} \qquad\text{with}\qquad c_r = \frac{1}{r!\,(r-1)!}. \tag{45}$$

A similar argument provides the asymptotic form of $P_n^{\mathcal{T}}$ when $\mathcal{T}$ is an arbitrary finite set:

$$P_n^{\mathcal{T}} \sim \frac{1}{\tau}\,\frac{n^{r-1}}{(r-1)!} \qquad\text{with}\qquad \tau := \prod_{n\in\mathcal{T}} n, \quad r := \mathrm{card}(\mathcal{T}).$$

This last estimate, originally due to Schur, is proved in Proposition IV.2, p. 258. ∎
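
The coefficient extraction in Example I.5 is a short computation: each factor $1/(1-z^t)$ is applied to a truncated series by a running sum with stride $t$. The sketch below (our own illustration) recovers the coin-change count and checks the two nearest-integer formulae in exact arithmetic.

```python
def restricted_partitions(n_max, parts):
    """Coefficients of prod_{t in parts} 1/(1 - z^t), up to z^{n_max}."""
    p = [1] + [0] * n_max
    for t in parts:
        for n in range(t, n_max + 1):
            p[n] += p[n - t]
    return p

def nearest_integer(num, den):
    """floor(num/den + 1/2) for positive integers, without floating point."""
    return (2 * num + den) // (2 * den)

print(restricted_partitions(99, [1, 5, 10, 25])[99])

p12 = restricted_partitions(50, [1, 2])
p123 = restricted_partitions(50, [1, 2, 3])
assert all(p12[n] == nearest_integer(2 * n + 3, 4) for n in range(51))
assert all(p123[n] == nearest_integer((n + 3) ** 2, 12) for n in range(51))
```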


We next examine compositions and partitions with a fixed number of summands. 


Example I.6. Compositions with a fixed number of parts. Let $\mathcal{C}^{(k)}$ denote the class of
compositions made of $k$ summands, $k$ a fixed integer $\ge 1$. One has

$$\mathcal{C}^{(k)} = \mathrm{SEQ}_k(\mathcal{I}) \equiv \mathcal{I} \times \mathcal{I} \times \cdots \times \mathcal{I},$$

where the number of terms in the cartesian product is $k$. From here, the corresponding generating
function is found to be

$$C^{(k)}(z) = \left(I(z)\right)^k \qquad\text{with}\qquad I(z) = \frac{z}{1-z}.$$

The number of compositions of $n$ having $k$ parts is thus

$$C_n^{(k)} = [z^n]\,\frac{z^k}{(1-z)^k} = \binom{n-1}{k-1},$$

a result which constitutes a combinatorial refinement of $C_n = 2^{n-1}$. (Note that the formula
$C_n^{(k)} = \binom{n-1}{k-1}$ also results easily from the balls-and-bars model of compositions (Figure I.7).)
In such a case, the asymptotic estimate $C_n^{(k)} \sim n^{k-1}/(k-1)!$ results immediately from the
polynomial form of the binomial coefficient $\binom{n-1}{k-1}$. ∎


Example I.7. Partitions with a fixed number of parts. Let $\mathcal{P}^{(\le k)}$ be the class of integer
partitions with at most $k$ summands. With our notation for restricted constructions (p. 30), this
class is specified as

$$\mathcal{P}^{(\le k)} = \mathrm{MSET}_{\le k}(\mathcal{I}).$$

It would be possible to appeal to the admissibility of such restricted compositions as developed
in Subsection I.6.1 below, but the following direct argument suffices in the case at hand. Geometrically,
partitions are represented as collections of points: this is the staircase model of
Figure I.7, p. 40. A symmetry around the main diagonal (also known in the specialized literature
as conjugation) exchanges number of summands and value of largest summand; one then has
(with earlier notations)

$$\mathcal{P}^{(\le k)} \cong \mathcal{P}^{\{1,\ldots,k\}} \quad\Longrightarrow\quad P^{(\le k)}(z) = P^{\{1,\ldots,k\}}(z),$$

so that, by (44),

$$P^{(\le k)}(z) = \prod_{m=1}^{k} \frac{1}{1 - z^m}. \tag{46}$$

As a consequence, the OGF of partitions with exactly $k$ summands, $P^{(k)}(z) = P^{(\le k)}(z) - P^{(\le k-1)}(z)$,
evaluates to

$$P^{(k)}(z) = \frac{z^k}{(1-z)(1-z^2)\cdots(1-z^k)}.$$

Given the equivalence between number of parts and largest part in partitions, the asymptotic
estimate (45) applies verbatim here. ∎


▷ I.15. Compositions with summands bounded in number and size. The number of compositions
of size $n$ with $k$ summands each at most $r$ is expressible as

$$[z^n]\left(z\,\frac{1 - z^r}{1 - z}\right)^{k},$$

which reduces to a simple binomial convolution (the calculation is similar to (42), p. 43). ◁

▷ I.16. Partitions with summands bounded in number and size. The number of partitions of
size $n$ with at most $k$ summands each at most $\ell$ is

$$[z^n]\,\frac{(1-z)(1-z^2)\cdots(1-z^{k+\ell})}{\left((1-z)(1-z^2)\cdots(1-z^k)\right)\cdot\left((1-z)(1-z^2)\cdots(1-z^\ell)\right)}.$$

(Verifying this by recurrence is easy.) The GF reduces to the binomial coefficient $\binom{k+\ell}{k}$ as
$z \to 1$; it is known as a Gaussian binomial coefficient, denoted $\binom{k+\ell}{k}_z$, or a “q-analogue” of
the binomial coefficient [14, 129]. ◁

The last example of this section illustrates the close interplay between combinatorial
decompositions and special function identities, which constitutes a recurrent
theme of classical combinatorial analysis.


Example I.8. The Durfee square of partitions and stack polyominoes. The diagram of any
partition contains a uniquely determined square (known as the Durfee square) that is maximal,
as exemplified by the following diagram:

[Diagram: a Ferrers diagram with its Durfee square of side $h$, a partition with at most $h$ parts to one side and a partition with parts at most $h$ to the other.]

This decomposition is expressed in terms of partition GFs as

$$\mathcal{P} \cong \bigcup_{h\ge 0} \left(\mathcal{Z}^{h^2} \times \mathcal{P}^{(\le h)} \times \mathcal{P}^{\{1,\ldots,h\}}\right).$$

It gives automatically, via (44) and (46), a non-trivial identity, which is nothing but a formal
rewriting of the geometric decomposition:

$$\prod_{n=1}^{\infty} \frac{1}{1 - z^n} = \sum_{h\ge 0} \frac{z^{h^2}}{\left((1-z)\cdots(1-z^h)\right)^2}$$

($h$ is the size of the Durfee square, known to manic bibliometricians as the “H-index”).
Stack polyominoes. Here is a similar case illustrating the direct correspondence between
geometric diagrams and generating functions, as afforded by the symbolic method. A stack
polyomino is the diagram of a composition such that for some $j$, $\ell$, one has
$1 \le x_1 \le x_2 \le \cdots \le x_j \ge x_{j+1} \ge \cdots \ge x_\ell \ge 1$ (see [552, §2.5] for further properties). The diagram
representation of stack polyominoes

[Diagram: a stack polyomino of maximal height $k$, decomposed as $\mathcal{Z}^k \times \mathcal{P}^{\{1,\ldots,k-1\}} \times \mathcal{P}^{\{1,\ldots,k\}}$.]

translates immediately into the OGF

$$S(z) = \sum_{k\ge 1} \frac{z^k}{1 - z^k}\,\frac{1}{\left((1-z)(1-z^2)\cdots(1-z^{k-1})\right)^2},$$

once use is made of the partition GFs $P^{\{1,\ldots,k\}}(z)$ of (44). This last relation provides a bona fide
algorithm for computing the initial values of the number of stack polyominoes (EIS A001523):

$$S(z) = z + 2z^2 + 4z^3 + 8z^4 + 15z^5 + 27z^6 + 47z^7 + 79z^8 + \cdots.$$

The book of van Rensburg [592] describes many such constructions and their relation to models
of statistical physics, especially polyominoes. For instance, related “q-Bessel” functions appear
in the enumeration of parallelogram polyominoes (Example IX.14, p. 660). ∎


▷ I.17. Systems of linear diophantine inequalities. Consider the class $\mathcal{F}$ of compositions of
integers into four summands $(x_1, x_2, x_3, x_4)$ such that

$$x_1 \ge 0, \quad x_2 \ge 2x_1, \quad x_3 \ge 2x_2, \quad x_4 \ge 2x_3,$$

where the $x_j$ are in $\mathbb{Z}_{\ge 0}$. The OGF is

$$F(z) = \frac{1}{(1-z)(1-z^3)(1-z^7)(1-z^{15})}.$$

Generalize to $r \ge 4$ summands (in $\mathbb{Z}_{\ge 0}$) and a similar system of inequalities. (Related GFs
appear on p. 200.) Work out elementarily the OGFs corresponding to the following systems of
inequalities:


$$\{x_1 + x_2 \le x_3\}, \quad \{x_1 + x_2 \ge x_3\}, \quad \{x_1 + x_2 \le x_3 + x_4\}, \quad \{x_1 \le x_2,\ x_2 \ge x_3,\ x_3 \le x_4\}.$$


More generally, the OGF of compositions into a fixed number of summands (in $\mathbb{Z}_{\ge 0}$),
constrained to satisfy a linear system of equations and inequalities with coefficients in $\mathbb{Z}$, is
rational; its denominator is a product of factors of the form $(1 - z^j)$. (Caution: this generalization is
non-trivial: see Stanley's treatment in [552, §4.6].) ◁
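
The first OGF of Note I.17 can be tested by brute force. Writing $x_1 = a$, $x_2 = 2a + b$, $x_3 = 2x_2 + c$, $x_4 = 2x_3 + d$ with $a, b, c, d \ge 0$ gives total size $15a + 7b + 3c + d$, which explains the exponents in the denominator. The sketch below (names are ours) compares direct enumeration with the series expansion.

```python
def brute_force_count(n):
    """Number of (x1, x2, x3, x4) in Z_{>=0}^4 with sum n,
    x2 >= 2*x1, x3 >= 2*x2 and x4 >= 2*x3."""
    count = 0
    for x1 in range(n + 1):
        for x2 in range(2 * x1, n + 1 - x1):
            for x3 in range(2 * x2, n + 1 - x1 - x2):
                x4 = n - x1 - x2 - x3
                if x4 >= 2 * x3:
                    count += 1
    return count

def series_coefficients(n_max, exponents=(1, 3, 7, 15)):
    """Coefficients of prod_e 1/(1 - z^e), up to z^{n_max}."""
    f = [1] + [0] * n_max
    for e in exponents:
        for n in range(e, n_max + 1):
            f[n] += f[n - e]
    return f

f = series_coefficients(40)
assert all(brute_force_count(n) == f[n] for n in range(41))
print(f[:16])
```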


Figure I.9 summarizes what has been learned regarding compositions and parti- 
tions. The way several combinatorial problems are solved effortlessly by the symbolic 
method is worth noting. 


I.3.2. Related constructions. It is also natural to consider the two constructions 
of cycle and powerset when these are applied to the set of integers $\mathcal{I}$.
                      Specification                                        OGF                                  coefficients

    Compositions:
      all             SEQ(SEQ_{\ge 1}(\mathcal{Z}))                        \frac{1-z}{1-2z}                     2^{n-1}  (p. 40)
      parts ≤ r       SEQ(SEQ_{1..r}(\mathcal{Z}))                         \frac{1-z}{1-2z+z^{r+1}}             \sim c_r \rho_r^{-n}  (pp. 42, 308)
      k parts         SEQ_k(SEQ_{\ge 1}(\mathcal{Z}))                      \frac{z^k}{(1-z)^k}                  \sim \frac{n^{k-1}}{(k-1)!}  (p. 44)
      cyclic          CYC(SEQ_{\ge 1}(\mathcal{Z}))                        Eq. (48)                             \sim \frac{2^n}{n}  (p. 48)

    Partitions:
      all             MSET(SEQ_{\ge 1}(\mathcal{Z}))                       \prod_{m=1}^{\infty} (1-z^m)^{-1}    \sim \frac{1}{4n\sqrt{3}} e^{\pi\sqrt{2n/3}}  (pp. 41, 574)
      parts ≤ r       MSET(SEQ_{1..r}(\mathcal{Z}))                        \prod_{m=1}^{r} (1-z^m)^{-1}         \sim \frac{n^{r-1}}{r!(r-1)!}  (pp. 43, 258)
      ≤ k parts       \cong MSET(SEQ_{1..k}(\mathcal{Z}))                  \prod_{m=1}^{k} (1-z^m)^{-1}         \sim \frac{n^{k-1}}{k!(k-1)!}  (pp. 44, 258)
      distinct parts  PSET(SEQ_{\ge 1}(\mathcal{Z}))                       \prod_{m=1}^{\infty} (1+z^m)         \sim \frac{3^{3/4}}{12 n^{3/4}} e^{\pi\sqrt{n/3}}  (pp. 48, 579)

Figure I.9. Partitions and compositions: specifications, generating functions, and
coefficients (in exact or asymptotic form).


Cyclic compositions (wheels). The class $\mathcal{D} = \mathrm{CYC}(\mathcal{I})$ comprises compositions
defined up to circular shift of the summands; so, for instance, 2 + 3 + 1 + 2 + 5,
3 + 1 + 2 + 5 + 2, etc., are identified. Alternatively, we may view elements of $\mathcal{D}$ as
“wheels” composed of circular arrangements of rows of balls (taken up to rotation):

[Diagram: a “wheel” (cyclic composition).]

By the translation of the cycle construction, the OGF is

$$D(z) = \sum_{k=1}^{\infty} \frac{\varphi(k)}{k}\,\log\left(1 - \frac{z^k}{1 - z^k}\right)^{-1} = z + 2z^2 + 3z^3 + 5z^4 + 7z^5 + 13z^6 + 19z^7 + 35z^8 + \cdots. \tag{47}$$

The coefficients are thus (EIS A008965)

$$D_n = \frac{1}{n}\sum_{k\mid n} \varphi(k)\left(2^{n/k} - 1\right) \equiv -1 + \frac{1}{n}\sum_{k\mid n} \varphi(k)\,2^{n/k} \sim \frac{2^n}{n}, \tag{48}$$

where the condition “$k \mid n$” indicates a sum over the integers $k$ dividing $n$. Notice that
$D_n$ is of the same asymptotic order as $\frac{1}{n}C_n$, which is suggested by circular symmetry
of wheels, but there is a factor: $D_n \sim 2C_n/n$.
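
Formula (48) can be compared with a direct enumeration of compositions up to circular shift. In the sketch below (an illustration; the choice of the lexicographically smallest rotation as canonical representative is ours), the two counts agree for small $n$.

```python
from math import gcd

def euler_phi(k):
    """Euler totient function, by direct counting."""
    return sum(1 for i in range(1, k + 1) if gcd(i, k) == 1)

def wheels_by_formula(n):
    """D_n = (1/n) * sum over k dividing n of phi(k) * (2^(n/k) - 1)."""
    total = sum(euler_phi(k) * (2 ** (n // k) - 1)
                for k in range(1, n + 1) if n % k == 0)
    return total // n

def compositions(n):
    """All compositions of n, as tuples of positive integers."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest

def wheels_by_enumeration(n):
    """Number of compositions of n up to circular shift."""
    canonical = set()
    for c in compositions(n):
        canonical.add(min(c[i:] + c[:i] for i in range(len(c))))
    return len(canonical)

print([wheels_by_formula(n) for n in range(1, 9)])
assert all(wheels_by_formula(n) == wheels_by_enumeration(n) for n in range(1, 11))
```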

Partitions into distinct summands. The class $\mathcal{Q} = \mathrm{PSET}(\mathcal{I})$ is the subclass
of $\mathcal{P} = \mathrm{MSET}(\mathcal{I})$ corresponding to partitions determined as in Definition I.9, but
with the strict inequalities $x_1 > x_2 > \cdots > x_k$, so that the OGF is

$$Q(z) = \prod_{n\ge 1} (1 + z^n) = 1 + z + z^2 + 2z^3 + 2z^4 + 3z^5 + 4z^6 + 5z^7 + \cdots. \tag{49}$$

The coefficients (EIS A000009) are not expressible in closed form. However, the
saddle-point method (Section VIII.6, p. 574) yields the approximation:

$$Q_n \sim \frac{3^{3/4}}{12\,n^{3/4}} \exp\left(\pi\sqrt{\frac{n}{3}}\right), \tag{50}$$

which has a shape similar to that of $P_n$ in (40), p. 41.


▷ I.18. Odd versus distinct summands. The partitions of $n$ into odd summands ($O_n$) and the
ones into distinct summands ($Q_n$) are equinumerous. Indeed, one has

$$Q(z) = \prod_{m=1}^{\infty} (1 + z^m), \qquad O(z) = \prod_{j=0}^{\infty} \left(1 - z^{2j+1}\right)^{-1}.$$

Equality results from substituting $(1 + a) = (1 - a^2)/(1 - a)$ with $a = z^m$,

$$Q(z) = \frac{1-z^2}{1-z}\,\frac{1-z^4}{\mathbf{1-z^2}}\,\frac{1-z^6}{1-z^3}\,\frac{1-z^8}{\mathbf{1-z^4}}\,\frac{1-z^{10}}{1-z^5}\cdots = \frac{1}{1-z}\,\frac{1}{1-z^3}\,\frac{1}{1-z^5}\cdots,$$

and simplification of the numerators with half of the denominators (in boldface). ◁
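
The identity of Note I.18 is readily confirmed on initial coefficients. In the sketch below (names are ours), the product for odd parts is expanded as in the multiset case, while the product $\prod(1+z^m)$ is expanded with a descending loop so that each part is used at most once.

```python
def odd_part_partitions(n_max):
    """Coefficients of O(z) = prod_{j>=0} 1/(1 - z^(2j+1)), up to z^{n_max}."""
    o = [1] + [0] * n_max
    for part in range(1, n_max + 1, 2):
        for n in range(part, n_max + 1):
            o[n] += o[n - part]
    return o

def distinct_part_partitions(n_max):
    """Coefficients of Q(z) = prod_{m>=1} (1 + z^m), up to z^{n_max}."""
    q = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        # descending n: each part contributes at most once
        for n in range(n_max, part - 1, -1):
            q[n] += q[n - part]
    return q

assert odd_part_partitions(60) == distinct_part_partitions(60)
print(distinct_part_partitions(10))
```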


Partitions into powers. Let $\mathcal{I}^{\mathrm{pow}} = \{1, 2, 4, 8, \ldots\}$ be the set of powers of 2. The
corresponding $\mathcal{P}$ and $\mathcal{Q}$ partitions have OGFs

$$P^{\mathrm{pow}}(z) = \prod_{j=0}^{\infty} \frac{1}{1 - z^{2^j}} = 1 + z + 2z^2 + 2z^3 + 4z^4 + 4z^5 + 6z^6 + 6z^7 + 10z^8 + \cdots,$$

$$Q^{\mathrm{pow}}(z) = \prod_{j=0}^{\infty} \left(1 + z^{2^j}\right) = 1 + z + z^2 + z^3 + z^4 + z^5 + \cdots.$$

The first sequence 1, 1, 2, 2, … is the “binary partition sequence” (EIS A018819); the
difficult asymptotic analysis was performed by de Bruijn [141] who obtained an estimate
that involves subtle fluctuations and is of the global form $e^{O(\log^2 n)}$. The function
$Q^{\mathrm{pow}}(z)$ reduces to $(1 - z)^{-1}$ since every number has a unique additive decomposition
into powers of 2. Accordingly, the identity

$$\frac{1}{1 - z} = \prod_{j=0}^{\infty} \left(1 + z^{2^j}\right),$$

first observed by Euler is sometimes nicknamed the “computer scientist's identity” as
it reflects the property that every number admits a unique binary representation.

There exists a rich set of identities satisfied by partition generating functions— 
this fact is down to deep connections with elliptic functions, modular forms, and 
q—analogues of special functions on the one hand, basic combinatorics and number 
theory on the other hand. See [14, 129] for introductions to this fascinating subject. 
▷ I.19. Euler's pentagonal number theorem. This famous identity expresses $1/P(z)$ as

$$\prod_{n\ge 1} (1 - z^n) = \sum_{k\in\mathbb{Z}} (-1)^k z^{k(3k+1)/2}.$$

It is proved formally and combinatorially in Comtet's reference [129, p. 105] and it serves to
illustrate “proofs from THE BOOK” in the splendid exposition of Aigner and Ziegler [7, §29].
Consequently, the numbers $\{P_j\}_{j=0}^{N}$ can be determined in $O(N\sqrt{N})$ integer operations. ◁
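
Since $P(z)\cdot\sum_{k}(-1)^k z^{k(3k+1)/2} = 1$, extracting the coefficient of $z^n$ gives $P_n = \sum_{k\ge 1} (-1)^{k+1}\left(P_{n-k(3k-1)/2} + P_{n-k(3k+1)/2}\right)$, with the convention that terms of negative index vanish. The sketch below (our own illustration) implements this recurrence; only about $\sqrt{n}$ pentagonal numbers lie below $n$, which accounts for the operation count of Note I.19.

```python
def partitions_pentagonal(n_max):
    """P_0..P_{n_max} via the recurrence implied by the pentagonal number theorem."""
    p = [1]
    for n in range(1, n_max + 1):
        total = 0
        k = 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 else -1
            total += sign * p[n - k * (3 * k - 1) // 2]
            if k * (3 * k + 1) // 2 <= n:
                total += sign * p[n - k * (3 * k + 1) // 2]
            k += 1
        p.append(total)
    return p

print(partitions_pentagonal(20))
```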


▷ I.20. A digital surprise. Define the constant

$$\varphi := \frac{9}{10}\cdot\frac{99}{100}\cdot\frac{999}{1000}\cdot\frac{9999}{10000}\cdots.$$

Is it a surprise that it evaluates numerically to

$$\varphi = 0.8900100999989990000001000099999999899999000000000010\cdots,$$

that is, its decimal representation involves only the digits 0, 1, 8, 9? [This is suggested by a note
of S. Ramanujan, “Some definite integrals”, Messenger of Math. XLIV, 1915, pp. 10–18.] ◁


▷ I.21. Lattice points. The number of lattice points with integer coordinates that belong to the
closed ball of radius $n$ in $d$-dimensional Euclidean space is

$$[z^{n^2}]\,\frac{1}{1-z}\left(\Theta(z)\right)^d \qquad\text{where}\qquad \Theta(z) = 1 + 2\sum_{n=1}^{\infty} z^{n^2}.$$

Estimates may be obtained via the saddle-point method (Note VIII.35, p. 589). ◁


I.4. Words and regular languages 


Fix a finite alphabet A whose elements are called letters. Each letter is taken to
have size 1; i.e., it is an atom. A word^8 is any finite sequence of letters, usually written
without separators. So, for us, with the choice of the Latin alphabet (A = {a,...,z}),
sequences such as ygololihp, philology, zgrmblglps are words. We denote
the set of all words (often written as A* in formal linguistics) by W. Following a
well-established tradition in theoretical computer science and formal linguistics, any
subset of W is called a language (or formal language, when the distinction with natural
languages has to be made).


^8 An alternative to the term “word” sometimes preferred by computer scientists is “string”; biologists
often refer to words as “sequences”.


                          OGF                                          Coefficients
    Words                 $\frac{1}{1 - mz}$                           $m^n$ (p. 50)
    a-runs < k            $\frac{1 - z^k}{1 - 2z + z^{k+1}}$           $\sim c_k \rho_k^{-n}$ (pp. 51, 308)
    exclude subseq. p     Eq. (55)                                     $\approx (m-1)^n n^{|p|-1}$ (p. 54)
    exclude factor p      $\frac{c_p(z)}{z^{|p|} + (1 - mz) c_p(z)}$   $\sim c_p \rho_p^{-n}$ (pp. 61, 271)
    circular              Eq. (64)                                     $\sim m^n / n$ (p. 64)
    regular language      [rational]                                   $\approx C \cdot A^n n^k$ (pp. 56, 302, 342)
    context-free lang.    [algebraic]                                  $\approx C \cdot A^n n^{p/q}$ (pp. 80, 501)


Figure I.10. Words over an m-ary alphabet: generating functions and coefficients.


From the definition of the set of words W, one has

\[ \mathcal{W} \cong \operatorname{SEQ}(\mathcal{A}) \implies W(z) = \frac{1}{1 - mz}, \tag{51} \]

where m is the cardinality of the alphabet, i.e., the number of letters. The generating
function gives us the counting result

\[ W_n = m^n. \]


This result is elementary, but, as is usual with symbolic methods, many enumerative 
consequences result from a given construction. It is precisely the purpose of this 
section to examine some of them. 


We shall introduce separately two frameworks that each have great expressive 
power for describing languages. The first one is iterative (i.e., non-recursive) and 
it bases itself on “regular specifications” that only involve the constructions of sum, 
product, and sequence; the other one, which is recursive (but of a very simple form), 
is best conceived of in terms of finite automata and is equivalent to linear systems of 
equations. Both frameworks turn out to be logically equivalent in the sense that they 
determine the same family of languages, the regular languages, though the equiva- 
lence is non-trivial (Appendix A.7: Regular languages, p. 733), and each particular 
problem usually admits a preferred representation. The resulting OGFs are invariably 
rational functions, a fact to be systematically exploited from an asymptotic standpoint 
in Chapter V. Figure I.10 recapitulates some of the major word problems studied in
this chapter, together with corresponding approximations^9.


^9 In this book, we reserve “∼” for the technical sense of “asymptotically equivalent” defined in
Appendix A.2: Asymptotic notations, p. 722; we reserve the symbol “≈” to mean “approximately equal” in
a vaguer sense, where formulae have been simplified by omitting constant factors or terms of secondary
importance (in context).

I.4.1. Regular specifications. Consider words (or strings) over the binary alphabet
A = {a, b}. There is an alternative way to construct binary strings. It is based
on the observation that, with a minor adjustment at the beginning, a string decomposes
into a succession of “blocks” each formed with a single b followed by an arbitrary
(possibly empty) sequence of a’s. For instance aaabaababaabbabbaaa decomposes
as


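[aaa] baa | ba | baa | b | ba | b | baaa.


Omitting redundant^10 symbols, we have the alternative decomposition:

\[ \mathcal{W} \cong \operatorname{SEQ}(a) \times \operatorname{SEQ}(b\, \operatorname{SEQ}(a)) \implies W(z) = \frac{1}{1 - z} \cdot \frac{1}{1 - z \frac{1}{1 - z}}. \tag{52} \]

This last expression reduces to $(1 - 2z)^{-1}$ as it should.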

Longest runs. The interest of the construction just seen is to take into account
various meaningful properties, for example longest runs. Abbreviate by
$a^{<k} := \operatorname{SEQ}_{<k}(a)$ the collection of all words formed with the letter a only and whose length is
between 0 and k − 1; the corresponding OGF is $1 + z + \cdots + z^{k-1} = (1 - z^k)/(1 - z)$.
The collection $\mathcal{W}^{\langle k \rangle}$ of words which do not have k consecutive a’s is described by an
amended form of (52):

\[ \mathcal{W}^{\langle k \rangle} = a^{<k}\, \operatorname{SEQ}(b\, a^{<k}) \implies W^{\langle k \rangle}(z) = \frac{1 - z^k}{1 - z} \cdot \frac{1}{1 - z \frac{1 - z^k}{1 - z}} = \frac{1 - z^k}{1 - 2z + z^{k+1}}. \]

The OGF is in principle amenable to expansion, but the resulting coefficient expressions
are complicated and, in such a case, asymptotic estimates tend to be more usable.
From the analysis developed in Example V.4 (p. 308), it can indeed be deduced that
the longest run of a’s in a random binary string of length n is on average asymptotic
to $\log_2 n$.
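Although the closed-form coefficients are unwieldy, the rational OGF translates directly into a linear recurrence that is easy to run. The sketch below (function names are ours) reads the recurrence off the denominator $1 - 2z + z^{k+1}$ and compares it with exhaustive enumeration; it assumes k ≥ 1.

```python
from itertools import product

def count_no_long_a_run(n_max, k):
    """Coefficients [z^n] (1 - z^k) / (1 - 2z + z^(k+1)) for n = 0..n_max (k >= 1)."""
    # Numerator 1 - z^k.
    num = [0] * (n_max + 1)
    num[0] = 1
    if k <= n_max:
        num[k] -= 1
    # Denominator 1 - 2z + z^(k+1) gives w[n] = num[n] + 2 w[n-1] - w[n-k-1].
    w = [0] * (n_max + 1)
    for n in range(n_max + 1):
        w[n] = num[n]
        if n >= 1:
            w[n] += 2 * w[n - 1]
        if n >= k + 1:
            w[n] -= w[n - k - 1]
    return w

def brute_force(n, k):
    """Count binary words of length n that do not contain k consecutive a's."""
    return sum(1 for w in product("ab", repeat=n) if "a" * k not in "".join(w))
```

For k = 2 the recurrence returns the Fibonacci-like sequence 1, 2, 3, 5, 8, 13, as expected for words without the factor aa.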

> 1.22. Runs in arbitrary alphabets. For an alphabet of cardinality m, the quantity 


\[ \frac{1 - z^k}{1 - mz + (m - 1) z^{k+1}} \]
is the OGF of words without k consecutive occurrences of a designated letter. <


The case of longest runs exemplifies the utility of nested constructions involving 
sequences. We set: 


Definition I.10. An iterative specification that only involves atoms (e.g., letters of a
finite alphabet A) together with combinatorial sums, cartesian products, and sequence
constructions is said to be a regular specification.

A language L is said to be S-regular (“specification-regular”) if there exists a
class M described by a regular specification such that L and M are combinatorially
isomorphic: $\mathcal{L} \cong \mathcal{M}$.

^10 When dealing with words, especially, we freely omit redundant braces “{, }” and cartesian products
“×”, for readability. For instance, SEQ(a + b) and a b are shorthand for SEQ({a} + {b}) and {a} × {b}.

An equivalent way of expressing the definition is as follows: a language is S-regular
if it can be described unambiguously by a regular expression (Appendix A.7:
Regular languages, p. 733). The definition of a regular specification and the basic
admissibility theorem (p. 27) imply immediately:

Proposition I.2. Any S-regular language has an OGF that is a rational function.
This OGF is obtained from a regular specification of the language by translating each
letter into the variable z, disjoint unions into sums, cartesian products into products,
and sequences into quasi-inverses, $(1 - \cdot)^{-1}$.


This result is technically shallow but its importance derives from the fact that 
regular languages have great expressive power devolving from their rich closure prop- 
erties (Appendix A.7: Regular languages, p. 733) as well as their relation to finite 
automata discussed in the next subsection. Examples I.9 and I.10 below make use of 
Proposition I.2 and treat two problems closely related to longest runs. 


Example I.9. Combinations and spacings. A regular specification describes the set L of words
that contain exactly k occurrences of the letter b, from which the OGF automatically follows:

\[ \mathcal{L} = \operatorname{SEQ}(a)\, (b\, \operatorname{SEQ}(a))^k \implies L(z) = \frac{z^k}{(1 - z)^{k+1}}. \tag{53} \]

Hence the number of words in the language satisfies $L_n = \binom{n}{k}$. This is otherwise combinatorially
evident, since each word of length n is characterized by the positions of its letters b; that
is, the choice of k positions among n possible ones. Symbolic methods thus give us back the
well-known count of combinations by binomial coefficients.

Let $\binom{n}{k}_{<d}$ be the number of combinations of k elements among [1, n] with constrained
spacings: no element can be at distance d or more from its successor. The refinement of (53)

\[ \mathcal{L}^{[d]} = \operatorname{SEQ}(a)\, (b\, \operatorname{SEQ}_{<d}(a))^{k-1}\, (b\, \operatorname{SEQ}(a)) \implies \sum_{n \ge 0} \binom{n}{k}_{<d} z^n = \frac{z^k (1 - z^d)^{k-1}}{(1 - z)^{k+1}} \]

leads to a binomial convolution expression,

\[ \binom{n}{k}_{<d} = \sum_{j} (-1)^j \binom{k-1}{j} \binom{n - dj}{k}. \]

(This problem is analogous to compositions with bounded summands in (42), p. 43.) What we
have just analysed is the largest spacing (constrained to be at most d) in subsets. A parallel
analysis yields information regarding the smallest spacing.
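The binomial convolution can be checked against direct enumeration. The sketch below reads the specification literally: fewer than d letters a may separate two successive letters b, so that consecutive chosen positions differ by at most d. The function names are ours, and k ≥ 1 is assumed.

```python
from itertools import combinations
from math import comb

def spacing_formula(n, k, d):
    """Binomial convolution sum_j (-1)^j C(k-1, j) C(n - d*j, k), for k >= 1."""
    total = 0
    for j in range(k):
        if n - d * j < 0:
            break  # remaining binomial coefficients vanish
        total += (-1) ** j * comb(k - 1, j) * comb(n - d * j, k)
    return total

def spacing_brute_force(n, k, d):
    """Count k-subsets of {1..n} whose consecutive elements differ by at most d."""
    return sum(
        1
        for subset in combinations(range(1, n + 1), k)
        if all(y - x <= d for x, y in zip(subset, subset[1:]))
    )
```

The formula comes from expanding $(1 - z^d)^{k-1}$ by the binomial theorem and extracting coefficients of $z^k/(1-z)^{k+1}$ term by term.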


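Example I.10. Double run statistics. By forming maximal groups of equal letters in words,
one finds easily that, for a binary alphabet,

\[ \mathcal{W} = \operatorname{SEQ}(b)\, \operatorname{SEQ}(a\, \operatorname{SEQ}(a)\, b\, \operatorname{SEQ}(b))\, \operatorname{SEQ}(a). \]

Let $\mathcal{W}^{\langle \alpha, \beta \rangle}$ be the class of all words that have at most α consecutive a’s and β consecutive b’s.
The specification of W induces a specification of $\mathcal{W}^{\langle \alpha, \beta \rangle}$, upon replacing SEQ(a), SEQ(b) by
$\operatorname{SEQ}_{<\alpha}(a)$, $\operatorname{SEQ}_{<\beta}(b)$ internally, and by $\operatorname{SEQ}_{\le \alpha}(a)$, $\operatorname{SEQ}_{\le \beta}(b)$ externally. In particular, the
OGF of binary words that never have more than r consecutive identical letters is found to be
(set α = β = r)

\[ W^{\langle r, r \rangle}(z) = \frac{1 - z^{r+1}}{1 - 2z + z^{r+1}} = \frac{1 + z + \cdots + z^r}{1 - z - \cdots - z^r}, \tag{54} \]

after simplification. (This result can be extended to an arbitrary alphabet by means of “Smirnov
words”, Example III.24, p. 204.)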


Révész in [508] tells the following amusing story attributed to T. Varga: “A class of high
school children is divided into two sections. In one of the sections, each child is given a coin
which he throws two hundred times, recording the resulting head and tail sequence on a piece
of paper. In the other section, the children do not receive coins, but are told instead that they
should try to write down a ‘random’ head and tail sequence of length two hundred. Collecting
these slips of paper, [a statistician] then tries to subdivide them into their original groups. Most
of the time, he succeeds quite well.”

The statistician’s secret is to determine the probability distribution of the maximum length
of runs of consecutive letters in a random binary word of length n (here n = 200). The
probability that this parameter equals k is

\[ \frac{1}{2^n} \left( W_n^{\langle k, k \rangle} - W_n^{\langle k-1, k-1 \rangle} \right) \]

and is fully determined by (54). The probabilities are then easily computed using any symbolic
package: for n = 200, the values found are

    k       3            4            5        6        7        8        9        10       11       12
    P(k)    6.54·10^-8   7.07·10^-4   0.0339   0.1660   0.2574   0.2235   0.1459   0.0829   0.0440   0.0226

Thus, in a randomly produced sequence of length 200, there are usually runs of length 6 or
more: the probability of the event turns out to be close to 97% (and there is still a probability of
about 8% to have a run of length 11 or more). On the other hand most children (and adults) are
usually afraid of writing down runs longer than 4 or 5 as this is felt as strongly “non-random”.
The statistician simply selects the slips that contain runs of length 6 or more as the true random
ones. Voilà!
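No symbolic package is strictly needed for this computation. A minimal sketch, assuming only (54): a non-empty binary word is determined by its first letter and its sequence of run lengths, so $W_n^{\langle r, r \rangle}$ is twice the number of compositions of n into parts of size at most r. The function names below are ours.

```python
from fractions import Fraction

def words_with_runs_at_most(n, r):
    """Number of binary words of length n in which no run is longer than r."""
    if n == 0:
        return 1
    if r == 0:
        return 0
    # comp[j]: compositions of j into parts 1..r (the successive run lengths).
    comp = [1] + [0] * n
    for j in range(1, n + 1):
        comp[j] = sum(comp[j - i] for i in range(1, min(r, j) + 1))
    return 2 * comp[n]  # factor 2: choice of the first letter

def max_run_distribution(n, k_values):
    """Exact probability that the longest run has length k, for each k in k_values."""
    return {
        k: Fraction(
            words_with_runs_at_most(n, k) - words_with_runs_at_most(n, k - 1),
            2 ** n,
        )
        for k in k_values
    }

if __name__ == "__main__":
    for k, p in max_run_distribution(200, range(3, 13)).items():
        print(k, float(p))
```

Run with n = 200, the printed values should be comparable with the table above.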


> 1.23. Alice, Bob, and coding bounds. Alice wants to communicate n bits of information to 
Bob over a channel (a wire, an optic fibre) that transmits 0,1-bits but is such that any occurrence
of 11 terminates the transmission. Thus, she can only send on the channel an encoded version
of her message (where the code is of some length ℓ ≥ n) that does not contain the pattern 11.

Here is a first coding scheme: given the message $m = m_1 m_2 \cdots m_n$, where $m_j \in \{0, 1\}$,
apply the substitution: 0 ↦ 00 and 1 ↦ 10; terminate the transmission by sending 11. This
scheme has ℓ = 2n + O(1), and we say that its rate is 2. Can one design codes with better
rates? with rates arbitrarily close to 1, asymptotically?

Let C be the class of allowed code words. For words of length n, a code of length L =
L(n) is achievable only if there exists a one-to-one mapping from $\{0, 1\}^n$ into $\bigcup_{j=0}^{L} \mathcal{C}_j$, i.e.,
$2^n \le \sum_{j=0}^{L} C_j$. Working out the OGF of C, one finds that necessarily

\[ L(n) \ge \lambda n + O(1), \qquad \lambda = \frac{1}{\log_2 \varphi} \doteq 1.440420, \qquad \varphi = \frac{1 + \sqrt{5}}{2}. \]

Thus no code can achieve a rate better than 1.44; i.e., a loss of at least 44% is unavoidable. (For
this and the next note, see, e.g., MacKay [427, Ch. 17].) <
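The counting condition of Note I.23 is easy to explore numerically. In the sketch below (names are ours), the numbers $C_j$ of words of length j without the factor 11 follow a Fibonacci recurrence, since such a word either ends with 0 or ends with 01; the smallest admissible L is then found by accumulating the $C_j$.

```python
from math import log2, sqrt

RATE_BOUND = 1 / log2((1 + sqrt(5)) / 2)  # lambda = 1 / log2(golden ratio)

def codewords_without_11(j_max):
    """C_j = number of binary words of length j with no factor 11, j = 0..j_max."""
    counts = [1, 2]
    while len(counts) <= j_max:
        counts.append(counts[-1] + counts[-2])
    return counts[: j_max + 1]

def minimal_code_length(n):
    """Smallest L such that 2^n <= C_0 + ... + C_L (a necessary condition only)."""
    current, following = 1, 2  # C_L and C_{L+1}
    total, L = 1, 0
    while total < 2 ** n:
        L += 1
        total += following
        current, following = following, current + following
    return L
```

For large n the ratio `minimal_code_length(n) / n` approaches `RATE_BOUND` from below, in line with the $O(1)$ term in the bound.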


> 1.24. Coding without long runs. Because of hysteresis in magnetic heads, certain storage 
devices cannot store binary sequences that have more than four consecutive 0s or more than
four consecutive 1s. We seek a coding scheme that transforms an arbitrary binary string into a
string obeying this constraint.

From the OGF, one finds $[z^{11}] W^{\langle 4, 4 \rangle}(z) = 1546 > 2^{10} = 1024$. Consequently, a substitution
can be built that translates an original 10-bit word into an 11-bit block that does not have
five consecutive equal letters. When 11-bit blocks are concatenated, this may however give rise
to forbidden sequences of identical consecutive letters at the junction of two blocks. It then
suffices to use “separators” and replace a substituted block of the form α · X · β by the longer
block $\bar{\alpha} \alpha \cdot X \cdot \beta \bar{\beta}$, where $\bar{0} = 1$ and $\bar{1} = 0$. The resulting code has rate 13/10.

Extensions of this method show that the rate 1.057 is achievable (theoretically). On the
other hand, by the principles of the previous note, any acceptable code must use asymptotically
at least 1.056n bits to encode strings of n bits. (Hint: let α be the root near 1/2 of $1 - 2\alpha + \alpha^5 = 0$,
which is a pole of $W^{\langle 4, 4 \rangle}$. One has $1/\log_2(1/\alpha) \doteq 1.05621$.) <
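Both numerical claims of Note I.24 can be reproduced in a few lines. This is only a sketch with names of our choosing: the count uses the composition reading of (54) with r = 4, and the pole is located by plain bisection on the bracket [0.5, 0.6], where the polynomial changes sign.

```python
from math import log2

def count_max_run_4(n):
    """[z^n] W^<4,4>(z): binary words of length n without five equal consecutive letters."""
    if n == 0:
        return 1
    comp = [1] + [0] * n  # comp[j]: compositions of j into run lengths 1..4
    for j in range(1, n + 1):
        comp[j] = sum(comp[j - i] for i in range(1, min(4, j) + 1))
    return 2 * comp[n]

def pole_near_half(tol=1e-12):
    """Root of 1 - 2x + x^5 in [0.5, 0.6], found by bisection."""
    def f(x):
        return 1 - 2 * x + x ** 5
    lo, hi = 0.5, 0.6  # f(lo) > 0 > f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def rate_lower_bound():
    """Asymptotic number of code bits per source bit, 1 / log2(1 / alpha)."""
    return 1 / log2(1 / pole_near_half())
```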


Patterns. There are many situations in the sciences where it is of interest to de- 
termine whether the appearance of a certain pattern in long sequences of observations 
is significant. In a genomic sequence of length 100 000 (the alphabet is A, G, C, T), is 
it or is it not meaningful to detect three occurrences of the pattern TAGATAA, where 
the letters appear consecutively and in the prescribed order? In computer network 
security, certain attacks can be detected by some well-defined alarming sequences of 
events, although these events may be separated by perfectly legitimate actions. On 
another register, data mining aims at broadly categorizing electronic documents in an 
automatic way, and in this context the observation of well-chosen patterns can provide 
highly discriminating criteria. These various applications require determining which 
patterns are, with high probability, bound to occur (these are not significant) and which 
are very unlikely to arise, so that actually observing them carries useful information. 
Quantifying the corresponding probabilistic phenomena reduces to an enumerative 
problem—the case of double runs in Example I.10 (p. 52) is in this respect typical. 

The notion of pattern can be formalized in several ways. In this book, we shall 
principally consider two of them. 


(a) Subsequence pattern: such a pattern is defined by the fact that its letters
must appear in the right order, but not necessarily contiguously [263]. Subsequence
patterns are also known as “hidden patterns”.

(b) Factor pattern: such a pattern is defined by the fact that its letters must appear
in the right order and contiguously [312, 564]. Factor patterns are also called
“block patterns” or simply “patterns” when the context is clear.


For a given notion of pattern, there are then two related categories of problems. First, 
one may aim at determining the probability that a random word contains (or dually, 
excludes) a pattern; this problem is equivalently formulated as an existence problem— 
enumerate all words in which the pattern exists (i.e., occurs) independently of the
number of occurrences. Second, one may aim at determining the expectation (or even 
the distribution) of the number of occurrences of a pattern in a random text; this prob- 
lem involves enumerating enriched words, each with one occurrence of the pattern 
distinguished. 

Such questions are amenable to methods of analytic combinatorics and in particular
to the theory of regular specifications and automata: see Example I.11 below for
a first attempt at analysing hidden patterns (to be continued in Chapter V, p. 315) and
Example I.12 for an analysis of factor patterns (to be further extended in Chapters III,
p. 211, IV, p. 271, and IX, p. 659).


Example I.11. Subsequence (hidden) patterns in a text. A sequence of letters that occurs
in the right order, but not necessarily contiguously in a text is said to be a “hidden pattern”.
For instance the pattern “combinatorics” is to be found hidden in Shakespeare’s Hamlet (Act I,
Scene 1), with the letters of the pattern shown here in brackets:

    Dared to the [comb]at; [in] which our v[a]lian[t] Hamlet—
    F[or] so th[i]s side of our known world esteem’d him—
    Did slay this Fortinbras; who by a seal’d [c]ompact,
    Well ratified by law and heraldry,
    Did forfeit, with hi[s] life, all those his lands [...]

Take a fixed finite alphabet A comprising m letters (m = 26 for English). First, let
us examine the language L of all words, also called “texts”, that contain a given word
$p = p_1 p_2 \cdots p_k$ of length k as a subsequence. These words can be described unambiguously as
starting with a sequence of letters not containing $p_1$ followed by the letter $p_1$ followed by a
sequence not containing $p_2$, and so on:

\[ \mathcal{L} = \operatorname{SEQ}(\mathcal{A} \setminus p_1)\, p_1\, \operatorname{SEQ}(\mathcal{A} \setminus p_2)\, p_2 \cdots \operatorname{SEQ}(\mathcal{A} \setminus p_k)\, p_k\, \operatorname{SEQ}(\mathcal{A}). \]

This is in a sense equivalent to parsing words unambiguously according to the left-most occurrence
of p as a subsequence. The OGF is accordingly

\[ L(z) = \frac{z^k}{(1 - (m - 1)z)^k} \cdot \frac{1}{1 - mz}. \tag{55} \]

An easy analysis of the dominant simple pole at z = 1/m shows that

\[ L(z) \underset{z \to 1/m}{\sim} \frac{1}{1 - mz}, \qquad \text{so that} \qquad L_n \underset{n \to \infty}{\sim} m^n. \]

Thus, a proportion tending to 1 of all the words of length n do contain a fixed pattern p as a
subsequence. (Note I.25 below refines this estimate.)


Mean number of occurrences. A census (Note I.26, p. 56) shows that there are in fact
$1.63 \cdot 10^{39}$ occurrences of “combinatorics” as a subsequence hidden somewhere in the
text of Hamlet, whose length is 120 057 (this is the number of letters that constitute the text). Is
this the sign of a secret encouragement passed to us by the author of Hamlet?

To answer this somewhat frivolous question, here is an analysis of the expected number
of occurrences of a hidden pattern. It is based on enumerating enriched words, where an enriched
word is a word together with a distinguished occurrence of the pattern as a subsequence.
Consider the regular specification

\[ \mathcal{O} = \operatorname{SEQ}(\mathcal{A})\, p_1\, \operatorname{SEQ}(\mathcal{A})\, p_2\, \operatorname{SEQ}(\mathcal{A}) \cdots \operatorname{SEQ}(\mathcal{A})\, p_{k-1}\, \operatorname{SEQ}(\mathcal{A})\, p_k\, \operatorname{SEQ}(\mathcal{A}). \]

An element of O is a (2k + 1)-tuple whose first component is an arbitrary word, whose second
component is the letter $p_1$, and so on, with letters of the pattern and free blocks alternating. In
other terms, any ω ∈ O represents precisely one possible occurrence of the hidden pattern p in
a text built over the alphabet A. The associated OGF is simply

\[ O(z) = \frac{z^k}{(1 - mz)^{k+1}}. \]

The ratio between the number of occurrences and the number of words of length n then equals

\[ \Omega_n = \frac{[z^n] O(z)}{m^n} = m^{-k} \binom{n}{k}, \tag{56} \]

and this quantity represents the expectation of the number of occurrences of p in a random word
of length n, assuming all such words to be equally likely. For the parameters corresponding to
the text of Hamlet (n = 120 057) and the pattern “combinatorics” (k = 13), the quantity
$\Omega_n$ evaluates to $6.96 \cdot 10^{37}$. The number of hidden occurrences observed is thus 23 times
higher than what the uniform model predicts! However, similar methods make it possible to
take into account non-uniform letter probabilities (Subsection III.6.1, p. 189): based on the
frequencies of letters in the English text itself, the expected number of occurrences is found to
be $1.71 \cdot 10^{39}$; this is now only within 5% of what is observed. Thus, Shakespeare did not
(probably) conceal any message relative to combinatorics; see Example V.7, p. 315, for more
on this topic.


> 1.25. A refined analysis. Further consideration of the subdominant pole at z = 1/(m — 1) 
yields, by the methods of Theorem IV.9 (p. 256), the refined estimate:

\[ 1 - \frac{L_n}{m^n} = O\!\left( n^{k-1} \left( 1 - \frac{1}{m} \right)^n \right). \]

Thus, the probability of not containing a given subsequence pattern is exponentially small. <


> 1.26. Dynamic programming. The number of occurrences of a subsequence pattern in a text 
can be determined efficiently by scanning the text from left to right and maintaining a running 
count of the number of occurrences of the pattern as well as all its prefixes. 
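One possible realization of the scan described in Note I.26 is sketched below, together with a check of formula (56) on a small case. The function names are ours; `counts[j]` holds the number of occurrences, in the text read so far, of the prefix of the pattern of length j.

```python
from fractions import Fraction
from itertools import product
from math import comb

def count_subsequence_occurrences(text, pattern):
    """Number of occurrences of pattern as a subsequence of text."""
    counts = [1] + [0] * len(pattern)  # the empty prefix occurs exactly once
    for letter in text:
        # Right to left, so that a text letter is used at most once per occurrence.
        for j in range(len(pattern), 0, -1):
            if pattern[j - 1] == letter:
                counts[j] += counts[j - 1]
    return counts[-1]

def expected_occurrences(n, k, m):
    """Formula (56): m^(-k) * C(n, k), under the uniform model."""
    return Fraction(comb(n, k), m ** k)

def mean_occurrences_brute_force(pattern, alphabet, n):
    """Average number of occurrences over all words of length n (small n only)."""
    total = sum(
        count_subsequence_occurrences(word, pattern)
        for word in product(alphabet, repeat=n)
    )
    return Fraction(total, len(alphabet) ** n)
```

The scan takes time proportional to the product of the text and pattern lengths. For the Hamlet parameters, `float(expected_occurrences(120057, 13, 26))` gives the order of magnitude quoted in Example I.11.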


I.4.2. Finite automata. We begin with a simple device, the finite automaton,
that is widely used in the study of models of computation [189] and has wide descrip- 
tive power with regard to structural properties of words. (A systematic treatment of 
automata and paths in graphs, combining both algebraic and asymptotic aspects, is 
given in Part B, Section V.5, p. 336.) 


Definition I.11. A finite automaton is a directed multigraph whose edges are labelled
by letters of the alphabet A. It is customary to refer to vertices as states and to denote
by Q the set of states. One designates an initial state $q_0 \in Q$ and a set of final states
$\overline{Q} \subseteq Q$.

The automaton is said to be deterministic if for each pair (q, a) with q ∈ Q and
a ∈ A there exists at most one edge (one also says a transition) starting from q, which
is labelled by the letter a.


A finite automaton (Figure I.11) is able to process words, as we now explain.
A word $w = w_1 \ldots w_n$ is accepted by the automaton if there exists a path in the
multigraph connecting the initial state $q_0$ to one of the final states of $\overline{Q}$ and whose
sequence of edge labels is precisely $w_1, \ldots, w_n$. For a deterministic finite automaton,
it suffices to start from the initial state $q_0$, scan the letters of the word from left to right,
and follow at each stage the only transition permitted; the word is accepted if the state
reached in this way after scanning the last letter of w is a final state. Schematically:


[Schematic figure: an automaton scanning the letters of a word from left to right.]


A finite automaton thus keeps only a finite memory of the past (hence its name) and 
is in a sense a combinatorial counterpart of the notion of Markov chain in probability 
theory. In this book, we shall only consider deterministic automata. 

As an illustration, consider the class L of all words w that contain the pattern
abb as a factor (the letters of the pattern should appear contiguously). Such words are
recognized by a finite automaton with four states, $q_0, q_1, q_2, q_3$. The construction is
classical: state $q_j$ is interpreted as meaning “the first j characters of the pattern have
just been scanned”, and the corresponding automaton appears in Figure I.11. The
initial state is $q_0$, and there is a unique final state $q_3$.


[Figure: state diagram with loops labelled b at q_0, a at q_1, and a, b at q_3, and with edges q_0 → q_1 (a), q_1 → q_2 (b), q_2 → q_3 (b), q_2 → q_1 (a).]

Figure I.11. Words that contain the pattern abb are recognized by a four-state automaton
with initial state $q_0$ and final state $q_3$.


Definition I.12. A language is said to be A-regular (automaton regular) if it coincides
with the set of words accepted by a deterministic finite automaton. A class M is A-regular
if for some regular language L, one has $\mathcal{M} \cong \mathcal{L}$.


> 1.27. Congruence languages. The language of binary representations of numbers that are 
congruent to 2 modulo 7 is A-regular. A similar property holds for any numeration base and
any boolean combination of basic congruence conditions. <


> I.28. Binary representation of primes. The language of binary representations of prime numbers
is neither A-regular nor S-regular. [Hint: use the Prime Number Theorem and asymptotic
methods of Chapter IV.] <

The following equivalence theorem is briefly discussed in Appendix A.7: Regular 
languages, p. 733. 


Equivalence theorem (Kleene–Rabin–Scott). A language is S-regular (specification
regular) if and only if it is A-regular (automaton regular).


These two equivalent notions also coincide with the notion of regularity in for- 
mal language theory, where the latter is defined by means of (possibly ambiguous) 
regular expressions and (possibly non-deterministic) finite automata [6, 189]. As al- 
ready pointed out, the equivalences are non-trivial: they are given by algorithms that 
transform one formalism into the other, but do not transparently preserve combina- 
torial structure (in some cases, an exponential blow-up in the size of descriptions is 
involved). For this reason, we have opted to develop independently the notions of 
S-regularity and A-regularity. 

We next examine the way generating functions can be obtained from a deterministic
automaton. The process was first discovered in the late 1950s by Chomsky and
Schützenberger [119].

Proposition I.3. Suppose that G is a deterministic finite automaton with state set
$Q = \{q_0, \ldots, q_s\}$, initial state $q_0$, and set of final states $\overline{Q} = \{q_{i_1}, \ldots, q_{i_f}\}$. The
generating function of the language L of all words accepted by the automaton is a
rational function that is determined under matrix form as

\[ L(z) = \mathbf{u} (I - zT)^{-1} \mathbf{v}. \]

Here the transition matrix T is defined by

\[ T_{j,k} = \operatorname{card}\{ a \in \mathcal{A} \text{ such that an edge } (q_j, q_k) \text{ is labelled by } a \}; \]

the row vector u is the vector (1, 0, 0, …, 0) and the column vector $\mathbf{v} = (v_0, \ldots, v_s)^t$
is such that^11 $v_j = [[q_j \in \overline{Q}]]$.


In particular, by Cramer’s rule, the OGF of a regular language is the quotient of two
(sparse) determinants whose structure directly reflects the automaton transitions.


Proof. The proof we present is based on a “first-letter decomposition”, which is
conceptually analogous to the Kolmogorov backward-equations of Markov chain theory
[93, p. 153]. (Note I.29 provides an alternative approach.) For j ∈ {0, …, s}, introduce
the class (language) $\mathcal{L}_j$ of all words w such that the automaton, when started
in state $q_j$, terminates in one of the final states of $\overline{Q}$, after having read w. The following
relation holds for any j:

\[ \mathcal{L}_j \cong \Delta_j + \sum_{a \in \mathcal{A}} \{a\}\, \mathcal{L}_{(q_j \circ a)}, \tag{57} \]

where $\Delta_j$ is the class {ε} formed of the word of length 0 if $q_j$ is final and the empty
set (∅) otherwise; the notation $(q_j \circ a)$ designates the state reached in one step from
state $q_j$ upon reading letter a. The justification is simple: a language $\mathcal{L}_j$ contains the
word of length 0 only if the corresponding state $q_j$ is final; a word of length ≥ 1 that
is accepted starting from state $q_j$ has a first letter a followed by a word that must lead
to an accepting state, when starting from state $q_j \circ a$.

The translation of (57) is then immediate:

\[ L_j(z) = [[q_j \in \overline{Q}]] + z \sum_{a \in \mathcal{A}} L_{(q_j \circ a)}(z). \tag{58} \]

The collection of all the equations as j varies forms a linear system: with L(z) the
column vector $(L_0(z), \ldots, L_s(z))$, one has

\[ \mathbf{L}(z) = \mathbf{v} + z T\, \mathbf{L}(z), \]

where v and T are as described in the statement. The result follows by matrix inversion
upon observing that the OGF of the language L is $L_0(z)$. ∎

> I.29. The forward equations. Let $\mathcal{M}_k$ be the set of words, which lead to state $q_k$, when the
automaton is started in state $q_0$. By a “last-letter decomposition”, the $\mathcal{M}_k$ satisfy a system that
is a transposed version of (58). <

The pattern abb. Consider the automaton recognizing the pattern abb as given
in Figure I.11. The languages $\mathcal{L}_j$ (where $\mathcal{L}_j$ is the set of accepted words when starting
from state $q_j$) are connected by the system of equations

\[ \begin{aligned}
\mathcal{L}_0 &= a \mathcal{L}_1 + b \mathcal{L}_0 \\
\mathcal{L}_1 &= a \mathcal{L}_1 + b \mathcal{L}_2 \\
\mathcal{L}_2 &= a \mathcal{L}_1 + b \mathcal{L}_3 \\
\mathcal{L}_3 &= a \mathcal{L}_3 + b \mathcal{L}_3 + \epsilon,
\end{aligned} \]

which directly reflects the graph structure of the automaton. This gives rise to a set of
equations for the associated OGFs

\[ \begin{aligned}
L_0 &= z L_1 + z L_0 \\
L_1 &= z L_1 + z L_2 \\
L_2 &= z L_1 + z L_3 \\
L_3 &= z L_3 + z L_3 + 1.
\end{aligned} \]

^11 It proves convenient at this stage to introduce Iverson’s bracket notation: for a predicate P, the
quantity [[P]] has value 1 if P is true and 0 otherwise.


Solving the system, we find the OGF of all words containing the pattern abb: it is
$L_0(z)$ since the initial state of the automaton is $q_0$, and

\[ L_0(z) = \frac{z^3}{(1 - z)(1 - 2z)(1 - z - z^2)}. \tag{59} \]

The partial fraction decomposition

\[ L_0(z) = \frac{1}{1 - 2z} - \frac{2 + z}{1 - z - z^2} + \frac{1}{1 - z} \]

then yields

\[ L_{0,n} = 2^n - F_{n+3} + 1, \]

with $F_n$ a Fibonacci number (p. 42). In particular the number of words of length n that
do not contain abb is $F_{n+3} - 1$, a quantity that grows at an exponential rate of $\varphi^n$, with
$\varphi = (1 + \sqrt{5})/2$ the golden ratio. Thus, all but an exponentially vanishing proportion
of the strings of length n contain the given pattern abb, a fact that was otherwise to
be expected on probabilistic grounds. (For instance, from Note I.32, p. 61, a random
word contains a large number, about n/8, of occurrences of the pattern abb.)
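Proposition I.3 can be exercised numerically on this example, since $[z^n]\, \mathbf{u}(I - zT)^{-1}\mathbf{v} = \mathbf{u} T^n \mathbf{v}$. The sketch below (names are ours) encodes the transition matrix of the abb automaton, iterates $\mathbf{v} \leftarrow T\mathbf{v}$, and compares the result with the closed form and with exhaustive enumeration; it assumes the convention $F_0 = 0$, $F_1 = 1$.

```python
from itertools import product

# Transition matrix of the abb automaton: rows are source states, columns targets.
T_ABB = [
    [1, 1, 0, 0],  # q0: b -> q0, a -> q1
    [0, 1, 1, 0],  # q1: a -> q1, b -> q2
    [0, 1, 0, 1],  # q2: a -> q1, b -> q3
    [0, 0, 0, 2],  # q3: a, b -> q3
]

def accepted_counts(T, final_states, n_max):
    """[L_0, ..., L_{n_max}] with L_n = u T^n v, u = (1, 0, ..., 0), v = indicator of final states."""
    s = len(T)
    v = [1 if j in final_states else 0 for j in range(s)]
    counts = []
    for _ in range(n_max + 1):
        counts.append(v[0])  # u selects the entry attached to the initial state q0
        v = [sum(T[j][k] * v[k] for k in range(s)) for j in range(s)]  # v <- T v
    return counts

def fibonacci(n):
    """F_n with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def brute_force_count(pattern, n):
    """Number of words of length n over {a, b} that contain pattern as a factor."""
    return sum(1 for w in product("ab", repeat=n) if pattern in "".join(w))
```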

> 1.30. Regular specification for pattern abb. The pattern abb is simple enough that one can 
come up with an equivalent regular expression describing $\mathcal{L}_0$, whose existence is otherwise
granted by the Kleene–Rabin–Scott Theorem. An accepting path in the automaton of Figure I.11
loops around state 0 with a sequence of b’s, then reads an a, loops around state 1 with
a sequence of a’s and moves to state 2 upon reading a b; then there should be letters making
the automaton pass through states 1-2-1-2-⋯-1-2 and finally a b followed by an arbitrary
sequence of a’s and b’s at state 3. This corresponds to the specification (with X* abbreviating
SEQ(X))

\[ \mathcal{L}_0 = (b)^\star a (a)^\star b\, (a (a)^\star b)^\star\, b\, (a + b)^\star \implies L_0(z) = \frac{z^3}{(1 - z)^2 \left( 1 - \frac{z^2}{1 - z} \right) (1 - 2z)}, \]

which gives back a form equivalent to (59). <
Example I.12. Words containing or excluding a pattern. Fix an arbitrary pattern
$p = p_1 p_2 \cdots p_k$ and let L be the language of words containing at least one occurrence of p as
a factor. Automata theory implies that the set of words containing a pattern as a factor is A-regular,
hence admits a rational generating function. Indeed, the construction given for p = abb
generalizes in an easy manner: there exists a deterministic finite automaton with k + 1 states
that recognizes L, the states memorizing the largest prefix of the pattern p just seen. As a consequence:
the OGF of the language of words containing a given factor pattern of length k is a
rational function of degree at most k + 1. (The corresponding automaton is in fact known as a
Knuth–Morris–Pratt automaton [382].) The automaton construction however provides the OGF
L(z) in determinantal form, so that the relation between this rational form and the structure of
the pattern is not transparent.




Autocorrelations. An explicit construction due to Guibas and Odlyzko [313] nicely circumvents
this problem. It is based on an “equational” specification that yields an alternative
linear system. The fundamental notion is that of an autocorrelation vector. For a given p, this
vector of bits $c = (c_0, \ldots, c_{k-1})$ is most conveniently defined in terms of Iverson’s bracket as

\[ c_i = [[\, p_{i+1} p_{i+2} \cdots p_k = p_1 p_2 \cdots p_{k-i} \,]]. \]

In other words, the bit $c_i$ is determined by shifting p right by i positions and putting a 1 if
the remaining letters match the original p. Graphically, $c_i = 1$ if the two framed factors of p
coincide in

    p ≡  p_1 ⋯ p_i [ p_{i+1} ⋯ p_k ]
                   [ p_1 ⋯ p_{k-i} ] p_{k-i+1} ⋯ p_k  ≡ p.

For instance, with p = aabbaa, one has

    a a b b a a
    a a b b a a                  1
      a a b b a a                0
        a a b b a a              0
          a a b b a a            0
            a a b b a a          1
              a a b b a a        1

The autocorrelation polynomial is defined as

\[ c(z) := \sum_{j=0}^{k-1} c_j z^j. \]

For the example pattern, this gives $c(z) = 1 + z^4 + z^5$.

Let S be the language of words with no occurrence of p and T the language of words that
end with p but have no other occurrence of p. First, by appending a letter to a word of S, one
finds a non-empty word either in S or T, so that

\[ \mathcal{S} + \mathcal{T} = \{\epsilon\} + \mathcal{S} \times \mathcal{A}. \tag{60} \]

Next, appending a copy of the word p to a word in S may only give words that contain p at or
“near” the end. In precise terms, the decomposition based on the left-most occurrence of p in
$\mathcal{S} p$ is

\[ \mathcal{S} \times \{p\} = \mathcal{T} \times \sum_{c_i \ne 0} \{ p_{k-i+1} p_{k-i+2} \cdots p_k \}, \tag{61} \]

corresponding to the configurations

[Figure: a word of S followed by p; the left-most occurrence of p ends a word of T, which is followed by the suffix $p_{k-i+1} \cdots p_k$.]


The translation of the system (60), (61) into OGFs then gives a system of two equations in the
two unknowns S, T,

\[ S + T = 1 + mzS, \qquad S \cdot z^k = T\, c(z), \]

which is then readily solved.

Proposition I.4. The OGF of words not containing the pattern p as a factor is

\[ S(z) = \frac{c(z)}{z^k + (1 - mz) c(z)}, \tag{62} \]

where m is the alphabet cardinality, k = |p| the pattern length, and c(z) the autocorrelation
polynomial of p.


A bivariate generating function based on the autocorrelation polynomial is derived in
Chapter III, p. 212, from which is deduced, in Proposition IX.10, p. 660, the existence of a
limiting Gaussian law for the number of occurrences of any pattern.
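Proposition I.4 lends itself to a direct numerical check. In the sketch below (function names are ours), the autocorrelation vector is computed from its definition, and the coefficients of S(z) are obtained from the identity $d(z) S(z) = c(z)$, where $d(z) = z^k + (1 - mz) c(z)$ has constant term 1.

```python
from itertools import product

def autocorrelation(p):
    """Bits c_0..c_{k-1}: c_i = 1 iff the suffix p[i:] equals the prefix p[:k-i]."""
    k = len(p)
    return [1 if p[i:] == p[: k - i] else 0 for i in range(k)]

def count_avoiding(p, m, n_max):
    """Coefficients S_0..S_{n_max} of c(z) / (z^k + (1 - m z) c(z))."""
    k = len(p)
    c = autocorrelation(p)
    # Denominator d(z) = z^k + c(z) - m z c(z); note d[0] = c_0 = 1.
    d = [0] * (k + 1)
    d[k] += 1
    for i, ci in enumerate(c):
        d[i] += ci
        d[i + 1] -= m * ci
    s = []
    for n in range(n_max + 1):
        numerator = c[n] if n < k else 0
        # d(z) S(z) = c(z) gives S_n = c_n - sum_{i >= 1} d_i S_{n-i}.
        s.append(numerator - sum(d[i] * s[n - i] for i in range(1, min(n, k) + 1)))
    return s

def avoiding_brute_force(p, alphabet, n):
    """Number of words of length n over alphabet with no occurrence of p as a factor."""
    return sum(1 for w in product(alphabet, repeat=n) if p not in "".join(w))
```

For p = abb over a binary alphabet this returns 1, 2, 4, 7, 12, …, that is, $F_{n+3} - 1$, in agreement with the automaton computation for abb.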


> 1.31. At least once. The GFs of words containing at least once the pattern (anywhere) and 
containing it only once at the end are

\[ L(z) = \frac{z^k}{(1 - mz)\bigl(z^k + (1 - mz) c(z)\bigr)}, \qquad T(z) = \frac{z^k}{z^k + (1 - mz) c(z)}, \]

respectively. <


▷ I.32. Expected number of occurrences of a pattern. For the mean number of occurrences 
of a factor pattern, calculations similar to those employed for the number of occurrences of 
a subsequence (even simpler) can be based on regular specifications. All the occurrences 
(contexts) of $p = p_1 p_2 \cdots p_k$ as a factor are described by

\[ \widehat{\mathcal{O}} = \mathrm{SEQ}(\mathcal{A})\,(p_1 p_2 \cdots p_k)\,\mathrm{SEQ}(\mathcal{A}) \quad\Longrightarrow\quad \widehat{O}(z) = \frac{z^k}{(1 - mz)^2}. \]

Consequently, the expected number of such contiguous occurrences satisfies

\[ \widehat{\Omega}_n = m^{-k}(n - k + 1) \sim \frac{n}{m^k}. \tag{63} \]

Thus, the mean number of occurrences is proportional to n. ◁


▷ I.33. Waiting times in strings. Let $\mathcal{L} \subset \mathrm{SEQ}\{a, b\}$ be a language and $\mathcal{S} = \{a, b\}^{\infty}$ be the set 
of infinite strings with the product probability induced by $P(a) = P(b) = \frac{1}{2}$. The probability 
that a random string $\omega \in \mathcal{S}$ starts with a word of $\mathcal{L}$ is $\widehat{L}(1/2)$, where $\widehat{L}(z)$ is the OGF of the 
“prefix language” of $\mathcal{L}$, that is, the set of words $w \in \mathcal{L}$ that have no strict prefix belonging to $\mathcal{L}$. 
The GF $\widehat{L}(z)$ serves to express the expected time at which a word in $\mathcal{L}$ is first encountered: this 
is $\frac{1}{2}\widehat{L}'(\frac{1}{2})$. For a regular language, this quantity must be a rational number. ◁


▷ I.34. A probabilistic paradox on strings. In a random infinite sequence, a pattern p of length k 
first occurs on average at time $2^k c(1/2)$, where c(z) is the autocorrelation polynomial. For 
instance, the pattern p = abb tends to occur “sooner” (at average position 8) than p′ = aaa (at 
average position 14). See [313] for a thorough discussion. Here are for instance the epochs at 
which p and p′ are first found in a sample of 20 runs:


p: 3,4,5,5, 6,6, 7,8, 8,8, 8,9,9, 10, 11, 14,15, 15, 16, 21 
p’: 3,4, 8, 8,9, 10,11, 11, 11, 12, 17, 22, 23, 27, 27, 27, 44, 47, 52, 52. 


On the other hand, patterns of the same length have the same expected number of occurrences, 
which is puzzling. Is analytic combinatorics contradictory? (Hint. The catch is that, due to 
overlaps of p′ with itself, occurrences of p′ tend to occur in clusters, but, then, clusters tend to 
be separated by wider gaps than for p; eventually, there is no contradiction.) ◁
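The two quantities contrasted in this note can be computed exactly. The sketch below is illustrative only: it assumes a binary alphabet with equiprobable letters, and the function names are not from the text. It evaluates the mean first-occurrence time $2^k c(1/2)$ and the mean number of occurrences $2^{-k}(n - k + 1)$ with rational arithmetic.

```python
from fractions import Fraction


def autocorrelation(p):
    """Autocorrelation vector (c_0, ..., c_{k-1}) of the pattern p."""
    k = len(p)
    return [1 if p[i:] == p[:k - i] else 0 for i in range(k)]


def expected_first_occurrence(p):
    """Mean time at which p is first completed: 2^k * c(1/2)."""
    k = len(p)
    c_half = sum(Fraction(cj, 2 ** j) for j, cj in enumerate(autocorrelation(p)))
    return 2 ** k * c_half


def expected_occurrences(p, n):
    """Mean number of occurrences of p in a random word of length n."""
    k = len(p)
    return Fraction(max(n - k + 1, 0), 2 ** k)
```

With these definitions, `expected_first_occurrence("abb")` is 8 and `expected_first_occurrence("aaa")` is 14, while `expected_occurrences` returns the same value for both patterns at any given length.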


▷ I.35. Borges’s Theorem. Take any fixed finite set Π of patterns. A random text of length n 
contains all the patterns of the set Π (as factors) with probability tending to 1 exponentially 
fast as n → ∞. Reason: the rational functions S(z/2) with S(z) as in (62) have no pole 
in |z| ≤ 1; see also Chapters III (p. 213), IV (p. 271), V (p. 308). This property is sometimes 
called “Borges’s Theorem” as a tribute to the famous Argentinian writer Jorge Luis Borges 
(1899–1986) who, in his essay “The Library of Babel”, describes a library so huge as to contain:

“Everything: the minutely detailed history of the future, the archangels’ autobiographies, 
the faithful catalogues of the Library, thousands and thousands of false catalogues, 
the demonstration of the fallacy of those catalogues, the demonstration of 
the fallacy of the true catalogue, the Gnostic gospel of Basilides, the commentary 
on that gospel, the commentary on the commentary on that gospel, the true story of 
your death, the translation of every book in all languages, the interpolations of every 
book in all books.” 


Strong versions of Borges’s Theorem, including the existence of limit Gaussian laws, hold for 
many random combinatorial structures, including trees, permutations, and planar maps (see 
Chapter IX, p. 659 and pp. 680-684). 


▷ I.36. Variable length codes. A finite set $\mathcal{F} \subset \mathcal{W}$, where $\mathcal{W} = \mathrm{SEQ}(\mathcal{A})$, is called a code if any 
word of $\mathcal{W}$ decomposes in at most one manner into factors that belong to $\mathcal{F}$ (with repetitions 
allowed). For instance $\mathcal{F} = \{a, ab, bb\}$ is a code and aaabbb = a|a|ab|bb has a unique 
decomposition; $\mathcal{F}' = \{a, aa, b\}$ is not a code since aaa = a|aa = aa|a = a|a|a. The OGF of 
the set $\mathcal{S}_{\mathcal{F}}$ of all words that admit a decomposition into factors all in $\mathcal{F}$ is a computable rational 
function, irrespective of whether $\mathcal{F}$ is a code. (Hint: use an “Aho–Corasick” automaton [5].) A 
finite set $\mathcal{F}$ is a code iff $S_{\mathcal{F}}(z) = (1 - F(z))^{-1}$. Consequently, the property of being a code 
can be decided in polynomial time using linear algebra. The book by Berstel and Perrin [55] 
develops systematically the theory of such variable-length codes. ◁
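The criterion $S_{\mathcal{F}}(z) = (1 - F(z))^{-1}$ can be explored experimentally by comparing coefficients up to a fixed order. The sketch below is only a bounded check, not the polynomial-time decision procedure alluded to in the note: it can refute the code property but cannot prove it. It assumes that all words of the set are non-empty strings; the function names are illustrative.

```python
def factorization_counts(words, n_max):
    """Coefficients of 1 / (1 - F(z)): sequences of words with total length n."""
    f = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        f[n] = sum(f[n - len(w)] for w in words if len(w) <= n)
    return f


def decomposable_counts(words, n_max):
    """Number of distinct words of each length that factor over the given set."""
    layers = [{""}]
    for n in range(1, n_max + 1):
        layer = set()
        for w in words:
            if len(w) <= n:
                # Extend every decomposable word of length n - |w| by w.
                layer.update(u + w for u in layers[n - len(w)])
        layers.append(layer)
    return [len(layer) for layer in layers]


def agrees_up_to(words, n_max):
    """True when S_F(z) and 1 / (1 - F(z)) agree up to the coefficient of z^n_max."""
    return factorization_counts(words, n_max) == decomposable_counts(words, n_max)
```

Here `agrees_up_to({"a", "ab", "bb"}, 10)` holds, whereas `agrees_up_to({"a", "aa", "b"}, 5)` fails: the word aa already has two factorizations.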


In general, automata are useful in establishing a priori the rational character of 
generating functions. They are also surrounded by interesting analytic properties (e.g., 
Perron–Frobenius theory, Section V.5, p. 336, that characterizes the dominant poles) 
and by asymptotic probability distributions of associated parameters that are normally 
Gaussian. They are most conveniently used for proving existence theorems, then sup- 
plemented when possible by regular specifications, which are likely to lead to more 
tractable expressions. 


I.4.3. Related constructions. Words can, at least in principle, encode any combinatorial 
structure. We detail here one situation that demonstrates the utility of such 
encodings: it is relative to set partitions and Stirling numbers. The point to be made is 
that some amount of “combinatorial preprocessing” is sometimes necessary in order 
to bring combinatorial structures into the orbit of symbolic methods. 


Set partitions and Stirling partition numbers. A set partition is a partition of a 
finite domain into a certain number of non-empty sets, also called blocks. For instance, 
if the domain is D = {α, β, γ, δ}, there are 15 ways to partition it (Figure I.12). Let 
$\mathcal{S}_n^{(r)}$ denote the collection of all partitions of the set [1 .. n] into r non-empty blocks 
and $S_n^{(r)} = \mathrm{card}(\mathcal{S}_n^{(r)})$ the corresponding cardinality. The basic object under 
consideration here is a set partition (not to be confused with integer partitions considered 
earlier).

It is possible to find an encoding of partitions in $\mathcal{S}_n^{(r)}$ of an n-set into r blocks by 
words over an r-letter alphabet, $\mathcal{B} = \{b_1, b_2, \ldots, b_r\}$, as follows. Consider a set partition 
ϖ that is formed of r blocks. Identify each block by its smallest element, called the 
block leader; then sort the block leaders into increasing order. Define the index of 
a block as the rank of its leader among all the r leaders, with ranks conventionally 
starting at 1. Scan the elements 1 to n in order and produce sequentially n letters from 
the alphabet $\mathcal{B}$: for an element belonging to the block of index j, produce the letter $b_j$.

    αβγδ
    α|βγδ    β|αγδ    γ|αβδ    δ|αβγ    αβ|γδ    αγ|βδ    αδ|βγ
    αβ|γ|δ   αγ|β|δ   αδ|β|γ   βγ|α|δ   βδ|α|γ   γδ|α|β
    α|β|γ|δ

Figure I.12. The 15 ways of partitioning a four-element domain into blocks correspond 
to $S_4^{(1)} = 1$, $S_4^{(2)} = 7$, $S_4^{(3)} = 6$, $S_4^{(4)} = 1$.


For instance, for n = 8, r = 3, the set partition ϖ = {{6, 4}, {5, 1, 2}, {3, 7, 8}} 
is reorganized by putting leaders in first position of the blocks and sorting them,

\[ \varpi = \{\overbrace{\{1, 2, 5\}}^{b_1},\ \overbrace{\{3, 7, 8\}}^{b_2},\ \overbrace{\{4, 6\}}^{b_3}\}, \]

so that the encoding is

\[ \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ b_1 & b_1 & b_2 & b_3 & b_1 & b_3 & b_2 & b_2 \end{pmatrix}. \]

In this way, a partition is encoded as a word of length n over $\mathcal{B}$ with the additional 
properties that: (i) all r letters occur; (ii) the first occurrence of $b_1$ precedes the first 
occurrence of $b_2$, which itself precedes the first occurrence of $b_3$, etc. Graphically, 
this correspondence can be rendered by an “irregular staircase” representation, such 
as

    b_3 |          ●     ●
    b_2 |       ●           ●  ●
    b_1 | ●  ●        ●
          1  2  3  4  5  6  7  8

where the staircase has length n and height r, each column contains exactly one element, 
each row corresponds to a class in the partition.

From the foregoing discussion, $\mathcal{S}_n^{(r)}$ is mapped into words of length n in the language

\[ b_1\,\mathrm{SEQ}(b_1) \cdot b_2\,\mathrm{SEQ}(b_1 + b_2) \cdot b_3\,\mathrm{SEQ}(b_1 + b_2 + b_3) \cdots b_r\,\mathrm{SEQ}(b_1 + b_2 + \cdots + b_r). \]


The language specification immediately gives the OGF

\[ S^{(r)}(z) = \frac{z^r}{(1 - z)(1 - 2z)(1 - 3z) \cdots (1 - rz)}. \]

The partial fraction expansion of $S^{(r)}(z)$ is then readily computed,

\[ S^{(r)}(z) = \frac{1}{r!} \sum_{j=0}^{r} \binom{r}{j} \frac{(-1)^{r-j}}{1 - jz}, \qquad\text{so that}\qquad S_n^{(r)} = \frac{1}{r!} \sum_{j=1}^{r} (-1)^{r-j} \binom{r}{j} j^n. \]

In particular, one has

\[ S_n^{(1)} = 1; \qquad S_n^{(2)} = \frac{1}{2!}(2^n - 2); \qquad S_n^{(3)} = \frac{1}{3!}(3^n - 3 \cdot 2^n + 3). \]

These numbers are known as the Stirling numbers of the second kind, or better, as 
the Stirling partition numbers, and the $S_n^{(r)}$ are nowadays usually denoted by ${n \brace r}$; 
see Appendix A.8: Stirling numbers, p. 735.
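The encoding of set partitions by words, and the resulting closed form $S_n^{(r)} = \frac{1}{r!}\sum_j (-1)^{r-j}\binom{r}{j} j^n$, can be cross-checked by exhaustive enumeration. This is a small sketch under illustrative assumptions: letters $b_1, \ldots, b_r$ are represented by the integers 1..r, partitions by lists of Python sets, and the function names are hypothetical.

```python
from itertools import product
from math import comb, factorial


def stirling_formula(n, r):
    """S_n^(r) = (1/r!) * sum_j (-1)^(r-j) * C(r, j) * j^n."""
    total = sum((-1) ** (r - j) * comb(r, j) * j ** n for j in range(r + 1))
    return total // factorial(r)


def stirling_by_encoding(n, r):
    """Count words of length n over {1..r} whose first occurrences are 1, 2, ..., r in order."""
    count = 0
    for w in product(range(1, r + 1), repeat=n):
        firsts = []
        for x in w:
            if x not in firsts:
                firsts.append(x)
        if firsts == list(range(1, r + 1)):
            count += 1
    return count


def encode(partition):
    """Encode a set partition of {1..n} as the word of block indices."""
    # Sort blocks by their leaders (smallest elements).
    blocks = sorted((sorted(b) for b in partition), key=lambda b: b[0])
    index = {x: j for j, b in enumerate(blocks, start=1) for x in b}
    return [index[x] for x in range(1, len(index) + 1)]
```

For example, `encode([{6, 4}, {5, 1, 2}, {3, 7, 8}])` returns `[1, 1, 2, 3, 1, 3, 2, 2]`, that is, the word $b_1 b_1 b_2 b_3 b_1 b_3 b_2 b_2$.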

The counting of set partitions could eventually be done successfully thanks to an 
encoding into words, and the corresponding language forms a constructible class of 
combinatorial structures (indeed, a regular language). In the next chapter, we shall 
examine a flexible approach to the counting of set partitions that is based on labelled 
structures and exponential generating functions (Subsection II.3.1, p. 106). 


Circular words (necklaces). Let A be a binary alphabet, viewed as comprised 
of beads of two distinct colours. The class of circular words or necklaces (Note I.1, 
p. 18, and Equation (20), p. 26) is defined by a CYC composition:

\[ \mathcal{N} = \mathrm{CYC}(\mathcal{A}) \quad\Longrightarrow\quad N(z) = \sum_{k=1}^{\infty} \frac{\varphi(k)}{k} \log \frac{1}{1 - 2z^k}. \tag{64} \]

The series starts as (EIS A000031)

\[ N(z) = 2z + 3z^2 + 4z^3 + 6z^4 + 8z^5 + 14z^6 + 20z^7 + 36z^8 + 60z^9 + \cdots, \]

and the OGF can be expanded:

\[ N_n = \frac{1}{n} \sum_{k \mid n} \varphi(k)\, 2^{n/k}. \tag{65} \]

It turns out that $N_n = D_n + 1$ where $D_n$ is the wheel count, p. 47. [The connection is 
easily explained combinatorially: start from a wheel and repaint in white all the nodes 
that are not on the basic circle; then fold them onto the circle.] The same argument 
proves that the number of necklaces over an m-ary alphabet is obtained by replacing 2 
by m in (65).
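Formula (65), $N_n = \frac{1}{n}\sum_{k \mid n} \varphi(k)\, m^{n/k}$ with m = 2 for the binary case, is easy to test against a direct enumeration of circular words. The sketch below is illustrative (names are not from the text); the brute-force version keeps the lexicographically smallest rotation of each word as a canonical representative.

```python
from itertools import product
from math import gcd


def phi(k):
    """Euler totient function, computed naively."""
    return sum(1 for j in range(1, k + 1) if gcd(j, k) == 1)


def necklaces_formula(n, m=2):
    """N_n = (1/n) * sum over divisors k of n of phi(k) * m^(n/k)."""
    return sum(phi(k) * m ** (n // k) for k in range(1, n + 1) if n % k == 0) // n


def necklaces_brute_force(n, m=2):
    """Count circular words of length n by canonical rotation."""
    seen = set()
    for w in product(range(m), repeat=n):
        seen.add(min(w[i:] + w[:i] for i in range(n)))
    return len(seen)
```

The list `[necklaces_formula(n) for n in range(1, 10)]` reproduces the coefficients 2, 3, 4, 6, 8, 14, 20, 36, 60 of the series displayed above.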

▷ I.37. Finite languages. Viewed as a combinatorial object, a finite language λ is a set of 
distinct words, with size being the total number of letters of all words in λ. For a binary alphabet, 
the class of all finite languages is thus

\[ \mathcal{FL} = \mathrm{PSET}(\mathrm{SEQ}_{\geq 1}(\mathcal{A})) \quad\Longrightarrow\quad FL(z) = \exp\left( \sum_{k \geq 1} \frac{(-1)^{k-1}}{k}\, \frac{2z^k}{1 - 2z^k} \right). \]

The series is (EIS A102866) $1 + 2z + 5z^2 + 16z^3 + 42z^4 + 116z^5 + 310z^6 + \cdots$. ◁


I.5. Tree structures 


This section is concerned with basic tree enumerations. Trees are, as we saw 
already, the prototypical recursive structure. The corresponding specifications nor- 
mally lead to nonlinear equations (and systems of such) over generating functions, the 
Lagrange inversion theorem being exactly suited to solving the simplest category of 
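problems. The functional equations furnished by the symbolic method can then conveniently 
be exploited by the asymptotic theory of Chapter VII (pp. 452–482). As we 
shall see there, a certain type of analytic behaviour appears to be “universal” in trees, 
namely the occurrence of a $\sqrt{\phantom{x}}$-singularity; accordingly, most tree families arising in 
the combinatorial world have counting sequences obeying a universal asymptotic form 
$C A^n n^{-3/2}$, which widely extends what we obtained elementarily for Catalan numbers 
on p. 38. A synopsis of what awaits us in this section is given in Figure I.13.

\[
\begin{array}{llll}
 & \text{Specification} & \text{OGF} & \text{Coefficient} \\
\hline
\text{Trees:} & & & \\
\text{plane general} & \mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}) & \frac{1}{2}\bigl(1 - \sqrt{1 - 4z}\bigr) & \frac{1}{n}\binom{2n-2}{n-1} \sim \frac{4^{n-1}}{\sqrt{\pi n^3}} \\
\text{– binary} & \mathcal{B} = 1 + \mathcal{Z} \times \mathcal{B} \times \mathcal{B} & \frac{1}{2z}\bigl(1 - \sqrt{1 - 4z}\bigr) & \frac{1}{n+1}\binom{2n}{n} \sim \frac{4^{n}}{\sqrt{\pi n^3}} \\
\text{– simple} & \mathcal{T} = \mathcal{Z} \times \mathrm{SEQ}_{\Omega}(\mathcal{T}) & T(z) = z\phi(T(z)) & \sim c\,\rho^{-n} n^{-3/2} \\
\text{non-plane gen.} & \mathcal{H} = \mathcal{Z} \times \mathrm{MSET}(\mathcal{H}) & H(z) = z\,\mathrm{Exp}(H(z)) & \sim \lambda \cdot \beta^{n} / n^{3/2} \\
\text{– binary} & \mathcal{U} = \mathcal{Z} + \mathrm{MSET}_2(\mathcal{U}) & \text{Eq. (76), p. 72} & \sim \lambda_2 \cdot \beta_2^{n} / n^{3/2} \\
\text{– simple} & \mathcal{V} = \mathcal{Z}\,\mathrm{MSET}_{\Omega}(\mathcal{V}) & \text{Eq. (73), p. 71} & \sim \tilde{c}\,\tilde{\rho}^{-n} n^{-3/2}
\end{array}
\]

Figure I.13. Rooted trees of type either plane or non-plane and asymptotic forms. 
There, λ ≐ 0.43992, β ≐ 2.95576; λ_2 ≐ 0.31877, β_2 ≐ 2.48325. References for 
asymptotics are pp. 452–482 of Chapter VII.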


I.5.1. Plane trees. Trees are commonly defined as undirected acyclic connected 
graphs. In addition, the trees considered in this book are, unless otherwise specified, 
rooted (Appendix A.9: Tree concepts, p. 737 and [377, §2.3]). In this subsection, we 
focus attention on plane trees, also sometimes called ordered trees, where subtrees 
dangling from a node are ordered between themselves. Alternatively, these trees may 
be viewed as abstract graph structures accompanied by an embedding into the plane. 
They are precisely described in terms of a sequence construction. 

First, consider the class G of general plane trees where all node degrees are allowed 
(this repeats material on p. 35): we have

\[ \mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}) \quad\Longrightarrow\quad G(z) = \frac{z}{1 - G(z)}, \tag{66} \]

and, accordingly, $G(z) = \frac{1}{2}\bigl(1 - \sqrt{1 - 4z}\bigr)$, so that the number of general trees of size n 
is a shifted Catalan number:

\[ G_n = C_{n-1} = \frac{1}{n} \binom{2n - 2}{n - 1}. \tag{67} \]

Many classes of trees defined by all sorts of constraints on properties of nodes 
appear to be of interest in combinatorics and in related areas such as formal logic and 
computer science. Let Ω be a subset of the integers that contains 0. Define the class 
$\mathcal{T}^{\Omega}$ of Ω-restricted trees as formed of trees such that the outdegrees of nodes are 
constrained to lie in Ω. In what follows, an essential rôle is played by a characteristic 
function that encapsulates Ω,

\[ \phi(u) := \sum_{\omega \in \Omega} u^{\omega}. \]

Thus, Ω = {0, 2} determines binary trees, where each node has either 0 or 2 descendants, 
so that $\phi(u) = 1 + u^2$; the choices Ω = {0, 1, 2} and Ω = {0, 3} determine, 
respectively, unary–binary trees ($\phi(u) = 1 + u + u^2$) and ternary trees ($\phi(u) = 1 + u^3$); 
the case of general trees corresponds to $\Omega = \mathbb{Z}_{\geq 0}$ and $\phi(u) = (1 - u)^{-1}$.

Proposition I.5. The ordinary generating function $T^{\Omega}(z)$ of the class $\mathcal{T}^{\Omega}$ of Ω-restricted 
trees is determined implicitly by the equation

\[ T^{\Omega}(z) = z\,\phi(T^{\Omega}(z)), \]

where φ is the characteristic of Ω, namely $\phi(u) := \sum_{\omega \in \Omega} u^{\omega}$. The tree counts are 
given by

\[ T^{\Omega}_n \equiv [z^n]\,T^{\Omega}(z) = \frac{1}{n}\,[w^{n-1}]\,\phi(w)^n. \tag{68} \]
n 


A class of trees whose generating function satisfies an equation of the form 
$y(z) = z\,\phi(y(z))$ is also called a simple variety of trees. The study of such families (in the 
unlabelled and labelled cases alike) is one of the recurrent themes of this book. 


Proof. Clearly, for Ω-restricted sequences, we have

\[ \mathcal{A} = \mathrm{SEQ}_{\Omega}(\mathcal{B}) \quad\Longrightarrow\quad A(z) = \phi(B(z)), \]

so

\[ \mathcal{T}^{\Omega} = \mathcal{Z} \times \mathrm{SEQ}_{\Omega}(\mathcal{T}^{\Omega}) \quad\Longrightarrow\quad T^{\Omega}(z) = z\,\phi(T^{\Omega}(z)). \]

This shows that $T \equiv T^{\Omega}$ is related to z by functional inversion:

\[ z = \frac{T}{\phi(T)}. \]

The Lagrange Inversion Theorem precisely provides expressions for such a case (see 
Appendix A.6: Lagrange Inversion, p. 732 for an analytic proof and Note I.47, p. 75, for 
combinatorial aspects):

Lagrange Inversion Theorem. The coefficients of an inverse function and of all its 
powers are determined by coefficients of powers of the direct function: if $z = T/\phi(T)$, 
then one has (with any $k \in \mathbb{Z}_{\geq 0}$):

\[ [z^n]\,T(z) = \frac{1}{n}\,[w^{n-1}]\,\phi(w)^n, \qquad [z^n]\,T(z)^k = \frac{k}{n}\,[w^{n-k}]\,\phi(w)^n. \tag{69} \]

The theorem immediately implies (68). ■
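The Lagrange formula $T_n = \frac{1}{n}[w^{n-1}]\phi(w)^n$ can be compared with a direct solution of $T = z\phi(T)$ by iteration on truncated power series. The following sketch makes some illustrative assumptions: φ is supplied as a list of integer coefficients (truncated if φ is not a polynomial), and the helper names are not from the text.

```python
def poly_mul(a, b, n_max):
    """Product of two coefficient lists, truncated after degree n_max."""
    out = [0] * (n_max + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > n_max:
                break
            out[i + j] += ai * bj
    return out


def tree_counts_lagrange(phi, n_max):
    """T_n = (1/n) [w^(n-1)] phi(w)^n for n = 1..n_max (index 0 holds 0)."""
    counts = [0]
    power = [1]  # phi(w)^0
    for n in range(1, n_max + 1):
        power = poly_mul(power, phi, n_max)
        counts.append(power[n - 1] // n)
    return counts


def tree_counts_iteration(phi, n_max):
    """Solve T = z * phi(T) by fixed-point iteration on truncated series."""
    t = [0] * (n_max + 1)
    for _ in range(n_max):
        # Evaluate phi(T) by Horner's rule, then multiply by z.
        acc = [0] * (n_max + 1)
        for coeff in reversed(phi):
            acc = poly_mul(acc, t, n_max)
            acc[0] += coeff
        t = [0] + acc[:n_max]
    return t
```

With `phi = [1, 0, 1]` (binary trees counted by all nodes) the non-zero counts are 1, 1, 2, 5, 14 at sizes 1, 3, 5, 7, 9; with `phi = [1] * 9` (general trees, truncated) one recovers the shifted Catalan numbers. Each iteration step fixes at least one more coefficient, which is why `n_max` rounds suffice.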


The form relative to powers $T^k$ in (69) is known as “Bürmann’s form” of Lagrange 
inversion; it yields the counting of (ordered) k-forests, which are k-sequences 
of trees. Furthermore, the statement of Proposition I.5 extends trivially to the case 
where Ω is a multiset; that is, a set of integers with repetitions allowed. For instance, 
Ω = {0, 1, 1, 3} corresponds to unary–ternary trees with two types of unary nodes, 
say, having one of two colours; in this case, the characteristic is $\phi(u) = u^0 + 2u^1 + u^3$. 
The theorem gives back the enumeration of general trees, where $\phi(u) = (1 - u)^{-1}$, by 
way of the binomial theorem applied to $(1 - u)^{-n}$. In general, it implies that, whenever 
Ω comprises r elements, $\Omega = \{\omega_1, \ldots, \omega_r\}$, the tree counts are expressed as an 
(r − 1)-fold summation of binomial coefficients (use the multinomial expansion). An 
important special case detailed in the next two examples below is when Ω has only 
two elements.

Figure I.14. A general tree of $\mathcal{G}_{51}$ (left) and a binary tree of $\mathcal{T}^{\{0,2\}}_{51} \equiv \mathcal{B}_{25}$ (right) 
drawn uniformly at random among the $C_{50}$ and $C_{25}$ possible trees, respectively, with 
$C_n = \frac{1}{n+1}\binom{2n}{n}$, the nth Catalan number.


Example I.13. Binary trees and Catalan numbers. A binary tree is a rooted plane tree, in 
which every node has either 0 or 2 successors (Figure I.14). In this case, it is customary to 
consider size to be the number of internal “branching” nodes, and we shall do so in most of the 
analyses to come. (By elementary combinatorics, if such a tree has ν internal nodes, it has ν + 1 
external nodes, hence it comprises 2ν + 1 nodes in total.) The specification and OGF of the 
class $\mathcal{B}$ of binary trees are then

\[ \mathcal{B} = 1 + (\mathcal{Z} \times \mathcal{B} \times \mathcal{B}) \quad\Longrightarrow\quad B(z) = 1 + zB(z)^2 \]

(observe the structural analogy with triangulations in (31), p. 36), so that

\[ B(z) = \frac{1 - \sqrt{1 - 4z}}{2z} \qquad\text{and}\qquad B_n = \frac{1}{n+1}\binom{2n}{n}, \]

again a Catalan number (with a shift of index when compared to (67)). In summary:

The number $B_n$ of plane binary trees having n internal nodes, i.e., (n + 1) external nodes 
and (2n + 1) nodes in total, is the Catalan number $B_n = C_n \equiv \frac{1}{n+1}\binom{2n}{n}$.

If one considers all nodes, internal and external alike, as contributing to size, the 
corresponding specification and OGF become

\[ \widehat{\mathcal{B}} = \mathcal{Z} + (\mathcal{Z} \times \widehat{\mathcal{B}} \times \widehat{\mathcal{B}}) \quad\Longrightarrow\quad \widehat{B}(z) = z\bigl(1 + \widehat{B}(z)^2\bigr), \]

and the Lagrangean form is recovered (as well as $\widehat{B}_{2\nu+1} = B_{\nu}$), with $\phi(u) = 1 + u^2$.

Alternatively, consider the class $\widetilde{\mathcal{B}}$ of pruned binary trees, which are binary trees stripped 
of their external nodes (Appendix A.9: Tree concepts, p. 737), where only trees in $\mathcal{B} \setminus \mathcal{B}_0$ are 
taken. The corresponding class $\widetilde{\mathcal{B}}$ satisfies (upon distinguishing left- and right-branching unary 
nodes of the pruned tree)

\[ \widetilde{\mathcal{B}} = \mathcal{Z} + (\mathcal{Z} \times \widetilde{\mathcal{B}}) + (\mathcal{Z} \times \widetilde{\mathcal{B}}) + (\mathcal{Z} \times \widetilde{\mathcal{B}} \times \widetilde{\mathcal{B}}) \quad\Longrightarrow\quad \widetilde{B}(z) = z\bigl(1 + \widetilde{B}(z)\bigr)^2, \]

which is now Lagrangean with $\phi(u) = (1 + u)^2$. These calculations, all with a strongly similar 
flavour, are explained by natural bijections in Subsection I.5.3, p. 73. ■

▷ I.38. Forests. Consider ordered k-forests of trees defined by $\mathcal{F} = \mathrm{SEQ}_k(\mathcal{T})$. The general 
form of Lagrange inversion implies

\[ [z^n]\,F(z) \equiv [z^n]\,T(z)^k = \frac{k}{n}\,[u^{n-k}]\,\phi(u)^n. \]

In particular, one has for forests of general trees ($\phi(u) = (1 - u)^{-1}$):

\[ [z^n] \left( \frac{1 - \sqrt{1 - 4z}}{2} \right)^{k} = \frac{k}{n} \binom{2n - k - 1}{n - 1}; \]

the coefficients are also known as “ballot numbers”. ◁


Example I.14. “Regular” (t-ary) trees. A tree is said to be t-regular or t-ary if Ω consists 
only of the elements {0, t} (the case t = 2 gives back binary trees). In other words, all internal 
nodes have degree t exactly. Let $\mathcal{A} := \mathcal{T}^{\{0,t\}}$. In this case, the characteristic is $\phi(u) = 1 + u^t$ 
and the binomial theorem combined with the Lagrange inversion formula gives

\[ A_n = \frac{1}{n}\,[u^{n-1}]\,(1 + u^t)^n = \frac{1}{n}\binom{n}{\frac{n-1}{t}} \qquad\text{provided } n \equiv 1 \pmod{t}. \]

As the formula shows, only trees of total size of the form n = tν + 1 exist (a well-known fact 
otherwise easily checked by induction), and

\[ A_{t\nu+1} = \frac{1}{t\nu + 1}\binom{t\nu + 1}{\nu} = \frac{1}{(t - 1)\nu + 1}\binom{t\nu}{\nu}. \tag{70} \]

As in the binary case, there is a variant of the determination of (70) that avoids congruence 
restrictions. Define the class $\widetilde{\mathcal{A}}$ of “pruned” trees as trees of $\mathcal{A} \setminus \mathcal{A}_0$ deprived of all their 
external nodes. The trees in $\widetilde{\mathcal{A}}$ now have nodes that are of degree at most t. In order to make 
$\widetilde{\mathcal{A}}$ bijectively equivalent to $\mathcal{A}$, it suffices to regard trees of $\widetilde{\mathcal{A}}$ as having $\binom{t}{j}$ possible types of 
nodes of degree j, for any $j \in [0, t]$: each node type in $\widetilde{\mathcal{A}}$ plainly encodes which of the original 
t − j subtrees have been pruned. With Ω now being a multiset, we find $\phi(u) = (1 + u)^t$ and 
$\widetilde{A}(z) = z\,\phi(\widetilde{A}(z))$, so that, by Lagrange inversion,

\[ \widetilde{A}_{\nu} = \frac{1}{\nu}\binom{t\nu}{\nu - 1} = \frac{1}{(t - 1)\nu + 1}\binom{t\nu}{\nu}, \]

yet another form of (70), since $\widetilde{A}_{\nu} = A_{t\nu+1}$. ■


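▷ I.39. Unary–binary trees and Motzkin numbers. Let $\mathcal{M}$ be the class of unary–binary trees:

\[ \mathcal{M} = \mathcal{Z} \times \mathrm{SEQ}_{\leq 2}(\mathcal{M}) \quad\Longrightarrow\quad M(z) = \frac{1 - z - \sqrt{1 - 2z - 3z^2}}{2z}. \]

One has $M(z) = z + z^2 + 2z^3 + 4z^4 + 9z^5 + 21z^6 + 51z^7 + \cdots$. The coefficients $M_n = [z^n]M(z)$, 
known as Motzkin numbers (EIS A001006), are given by

\[ M_n = \frac{1}{n} \sum_{k} \binom{n}{k}\binom{n - k}{k - 1}, \]

as a consequence of the Lagrange Inversion Theorem. ◁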


▷ I.40. Yet another variant of t-ary trees. Let $\mathcal{A}$ be the class of t-ary trees, but with size now 
defined as the number of external nodes (leaves). Then, one has

\[ \mathcal{A} = \mathcal{Z} + \mathrm{SEQ}_t(\mathcal{A}). \]




The binomial form of $A_n$ follows from Lagrange inversion, since $A = z/(1 - A^{t-1})$. Can this 
last relation be interpreted combinatorially? ◁


Example I.15. Hipparchus of Rhodes and Schröder. In 1870, the German mathematician Ernst 
Schröder (1841–1902) published a paper entitled Vier combinatorische Probleme. The paper 
had to do with the number of terms that can be built out of n variables using non-associative 
operations. In particular, the second of his four problems asks for the number of ways a string 
of n identical letters, say x, can be “bracketed”. The rule is best stated recursively: x itself is a 
bracketing and if $\sigma_1, \sigma_2, \ldots, \sigma_k$ with k ≥ 2 are bracketed expressions, then the k-ary product 
$(\sigma_1 \sigma_2 \cdots \sigma_k)$ is a bracketing. For instance: (((xx)x(xxx))((xx)(xx)x)).

Let $\mathcal{S}$ denote the class of all bracketings, where size is taken to be the number of variable 
instances. Then, the recursive definition is readily translated into the formal specification (with 
$\mathcal{Z}$ representing x) and the OGF equation:

\[ \mathcal{S} = \mathcal{Z} + \mathrm{SEQ}_{\geq 2}(\mathcal{S}) \quad\Longrightarrow\quad S(z) = z + \frac{S(z)^2}{1 - S(z)}. \tag{71} \]

Indeed, to each bracketing of size n is associated a tree whose external nodes contain the 
variable x (and determine size), with internal nodes corresponding to bracketings and having degree 
at least 2 (while not contributing to size).

The functional equation satisfied by the OGF is not a priori of the type corresponding 
to Proposition I.5, because not all nodes contribute to size in this particular application. 
Note I.41 provides a reduction to Lagrangean form; however, in a simple case like this, the 
quadratic equation induced by (71) is readily solved, giving

\[
\begin{aligned}
S(z) &= \frac{1}{4}\left(1 + z - \sqrt{1 - 6z + z^2}\right) \\
     &= z + z^2 + 3z^3 + 11z^4 + 45z^5 + 197z^6 + 903z^7 + 4279z^8 + 20793z^9 \\
     &\quad + 103049z^{10} + 518859z^{11} + \cdots,
\end{aligned}
\]

where the coefficients are EIS A001003. (These numbers also count series–parallel networks of 
a specified type (e.g., serial in Figure I.15, bottom), where placement in the plane matters.) 

In an instructive paper, Stanley [553] discusses a page of Plutarch’s Moralia where there 
appears the following statement: 


“Chrysippus says that the number of compound propositions that can be made from 
only ten simple propositions exceeds a million. (Hipparchus, to be sure, refuted this 
by showing that on the affirmative side there are 103 049 compound statements, and 
on the negative side 310 952.)” 


It is notable that the tenth number of Hipparchus of Rhodes¹² (c. 190–120 BC) is precisely 
$S_{10} = 103049$. This is, for instance, the number of logical formulae that can be formed from 
ten boolean variables $x_1, \ldots, x_{10}$ (used once each and in this order) using and-or connectives in 
alternation (no “negation”), upon starting from the top in some conventional fashion¹³, e.g., with 
an and-clause; see Figure I.15. Hipparchus was naturally not cognizant of generating functions, 
but with the technology of the time (and a rather remarkable mind!), he would still be able to 
discover a recurrence equivalent to (71),

\[ S_n = [\![\, n \geq 2 \,]\!] \Biggl( \sum_{\substack{n_1 + \cdots + n_k = n \\ k \geq 2}} S_{n_1} S_{n_2} \cdots S_{n_k} \Biggr) + [\![\, n = 1 \,]\!], \tag{72} \]

where the sum has only 42 essentially different terms for n = 10 (see [553] for a discussion), 
and finally determine $S_{10}$. ■

    (x_1) ∧ (x_2 ∨ (x_3 ∧ x_4 ∧ x_5) ∨ x_6) ∧ ((x_7 ∧ x_8) ∨ (x_9 ∧ x_10))

Figure I.15. An and-or positive proposition of the conjunctive type (top), its associated 
tree (middle), and an equivalent planar series–parallel network of the serial type 
(bottom).

¹² This was first observed by David Hough in 1994; see [553]. In [315], Habsieger et al. further note 
that $\frac{1}{2}(S_{10} + S_{11}) = 310\,954$, and suggest a related interpretation (based on negated variables) for the other 
count given by Hipparchus.

¹³ Any functional term admits a unique tree representation. Here, as soon as the root type has been 
fixed (e.g., an ∧ connective), the others are determined by level parity. The constraint of node degrees ≥ 2 
in the tree means that no superfluous connectives are used. Finally, any monotone boolean expression can 
be represented by a series–parallel network: the $x_j$ are viewed as switches with the true and false values 
being associated with closed and open circuits, respectively.
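▷ I.41. The Lagrangean form of Schröder’s GF. The generating function S(z) admits the form

\[ S(z) = z\,\phi(S(z)) \qquad\text{where}\qquad \phi(y) = \frac{1 - y}{1 - 2y} \]

is the OGF of compositions. Consequently, one has

\[
\begin{aligned}
S_n &= \frac{1}{n}\,[u^{n-1}] \left( \frac{1 - u}{1 - 2u} \right)^{n} \\
    &= \frac{(-1)^{n-1}}{n} \sum_{k} (-2)^k \binom{n}{k+1}\binom{n + k - 1}{k}
     = \frac{1}{n} \sum_{k=0}^{n-2} \binom{2n - k - 2}{n - 1}\binom{n - 2}{k}.
\end{aligned}
\]

Is there a direct combinatorial relation to compositions? ◁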
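The Hipparchus–Schröder numbers can be computed in two independent ways and compared. The sketch below is illustrative (function names are not from the text). The first routine uses the quadratic equation $2S^2 - (1 + z)S + z = 0$, which follows from $S = z + S^2/(1 - S)$ and gives $S_n = 2\sum_{i=1}^{n-1} S_i S_{n-i} - S_{n-1}$ for n ≥ 2. The second expands the Lagrangean form $\frac{1}{n}[u^{n-1}]\bigl((1 - u)/(1 - 2u)\bigr)^n$ with two binomial series.

```python
from math import comb


def schroeder_quadratic(n_max):
    """S_0..S_n_max from 2 S(z)^2 - (1 + z) S(z) + z = 0."""
    s = [0, 1]
    for n in range(2, n_max + 1):
        conv = sum(s[i] * s[n - i] for i in range(1, n))
        s.append(2 * conv - s[n - 1])
    return s


def schroeder_lagrange(n):
    """S_n = (1/n) [u^(n-1)] ((1 - u) / (1 - 2u))^n."""
    # (1 - u)^n contributes (-1)^j C(n, j) u^j; (1 - 2u)^(-n) contributes
    # 2^i C(n - 1 + i, n - 1) u^i, with i = n - 1 - j.
    total = sum((-1) ** j * comb(n, j) * 2 ** (n - 1 - j) * comb(2 * n - 2 - j, n - 1)
                for j in range(n))
    return total // n
```

Both routines give $S_{10} = 103049$ and $S_{11} = 518859$, so that $\frac{1}{2}(S_{10} + S_{11}) = 310954$.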


▷ I.42. Faster determination of Schröder numbers. By forming a differential equation satisfied 
by S(z) and extracting coefficients, one obtains a recurrence

\[ (n + 2)\,S_{n+2} - 3(2n + 1)\,S_{n+1} + (n - 1)\,S_n = 0, \qquad n \geq 1, \]

that entails a fast determination, in linear time, of the $S_n$. (This technique, which originates 
with Euler [199], is applicable to any algebraic function; see Appendix B.4: Holonomic functions, 
p. 748.) In contrast, Hipparchus’s recurrence (72) implies an algorithm of complexity 
$\exp(O(\sqrt{n}))$ in the number of arithmetic operations involved. ◁

I.5.2. Non-plane trees. An unordered tree, also called non-plane tree, is just 
a tree in the general graph-theoretic sense, so that there is no order between subtrees 
emanating from a common node. The unordered trees considered here are furthermore 
rooted, meaning that one of the nodes is distinguished as the root. Accordingly, in the 
language of constructions, a rooted unordered tree is a root node linked to a multiset 
of trees. Thus, the class $\mathcal{H}$ of all unordered trees admits the recursive specification:

\[
\mathcal{H} = \mathcal{Z} \times \mathrm{MSET}(\mathcal{H}) \quad\Longrightarrow\quad
\begin{cases}
H(z) = z \displaystyle\prod_{m=1}^{\infty} (1 - z^m)^{-H_m} \\[2ex]
\phantom{H(z)} = z \exp\left( H(z) + \frac{1}{2}H(z^2) + \frac{1}{3}H(z^3) + \cdots \right).
\end{cases}
\tag{73}
\]

The first form of the OGF was given by Cayley in 1857 [67, p. 43]; it does not admit 
a closed form solution, although the equation permits one to determine all the $H_n$ 
recursively (EIS A000081):

\[ H(z) = z + z^2 + 2z^3 + 4z^4 + 9z^5 + 20z^6 + 48z^7 + 115z^8 + 286z^9 + \cdots. \]

The enumeration of the class of trees defined by an arbitrary set Ω of node degrees 
immediately results from the translation of sets of fixed cardinality.


Proposition I.6. Let $\Omega \subset \mathbb{N}$ be a finite set of integers containing 0. The OGF U(z) of 
non-plane trees with degrees constrained to lie in Ω satisfies a functional equation of 
the form

\[ U(z) = z\,\Phi\bigl(U(z), U(z^2), U(z^3), \ldots\bigr), \tag{74} \]

for some computable polynomial Φ.

Proof. The class of trees satisfies the combinatorial equation,

\[ \mathcal{U} = \mathcal{Z} \times \mathrm{MSET}_{\Omega}(\mathcal{U}) \qquad \Bigl( \mathrm{MSET}_{\Omega}(\mathcal{U}) \equiv \sum_{\omega \in \Omega} \mathrm{MSET}_{\omega}(\mathcal{U}) \Bigr), \]

where the multiset construction reflects non-planarity, since subtrees stemming from 
a node can be freely rearranged between themselves and may appear repeated. Anticipating 
on what we shall see later, we note that Theorem I.3 (p. 84) provides the 
translation of $\mathrm{MSET}_k(\mathcal{U})$:

\[ \Phi\bigl(U(z), U(z^2), U(z^3), \ldots\bigr) = \sum_{\omega \in \Omega} [u^{\omega}] \exp\left( \frac{u}{1}U(z) + \frac{u^2}{2}U(z^2) + \cdots \right). \]

The statement then follows immediately. ■


In the area of non-plane tree enumerations, there are no explicit formulae but only 
functional equations implicitly determining the generating functions. However, as we 
shall see in Section VII.5 (p. 475), the equations may be used to analyse the dominant 
singularity of U(z). We shall find that a “universal” law governs the singularities of 
simple tree generating functions, either plane or non-plane (Figure I.13): the singularities 
are of the general type $\sqrt{1 - z/\rho}$, which, by singularity analysis, translates 
into

\[ U_n^{\Omega} \sim \lambda_{\Omega} \frac{(\beta_{\Omega})^n}{\sqrt{n^3}}. \tag{75} \]

Many of these questions have their origin in enumerative combinatorial chemistry, a 
subject started by Cayley in the nineteenth century [67, Ch. 4]. Pólya re-examined 
these questions, and, in his important paper [488] published in 1937, he developed 
at the same time a general theory of combinatorial enumerations under group actions 
and systematic methods giving rise to estimates such as (75). See the book by Harary 
and Palmer [319] for more on this topic or Read’s edition of Pólya’s paper [491].
▷ I.43. Fast determination of the Cayley–Pólya numbers. Logarithmic differentiation of H(z) 
provides for the $H_n$ a recurrence by which one computes $H_n$ in time polynomial in n. (Note: a 
similar technique applies to the partition numbers $P_n$; see p. 42.) ◁
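One way to carry out the logarithmic differentiation is as follows. Starting from $H(z) = z \exp\bigl(\sum_{k \geq 1} H(z^k)/k\bigr)$, taking logarithms, differentiating and multiplying by z gives $zH'(z) = H(z)\bigl(1 + \sum_{m \geq 1} \sigma_m z^m\bigr)$ with $\sigma_m = \sum_{d \mid m} d\,H_d$, hence $(n - 1)H_n = \sum_{m=1}^{n-1} \sigma_m H_{n-m}$. The sketch below implements this recurrence; the notation $\sigma_m$ and the function name are introduced here for illustration and are not from the text.

```python
def rooted_unordered_trees(n_max):
    """H_0..H_n_max via (n - 1) H_n = sum_{m=1}^{n-1} sigma_m H_{n-m}."""
    h = [0, 1]
    sigma = [0] * (n_max + 1)
    for n in range(2, n_max + 1):
        m = n - 1
        # sigma_m = sum of d * H_d over the divisors d of m.
        sigma[m] = sum(d * h[d] for d in range(1, m + 1) if m % d == 0)
        total = sum(sigma[j] * h[n - j] for j in range(1, n))
        h.append(total // (n - 1))
    return h
```

The call `rooted_unordered_trees(9)[1:]` returns 1, 1, 2, 4, 9, 20, 48, 115, 286, in agreement with the series for H(z).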


▷ I.44. Binary non-plane trees. Unordered binary trees $\mathcal{V}$, with size measured by the number 
of external nodes, are described by the equation $\mathcal{V} = \mathcal{Z} + \mathrm{MSET}_2(\mathcal{V})$. The functional equation 
determining V(z) is

\[ V(z) = z + \frac{1}{2}V(z)^2 + \frac{1}{2}V(z^2); \qquad V(z) = z + z^2 + z^3 + 2z^4 + 3z^5 + \cdots. \tag{76} \]

The asymptotic analysis of the coefficients (EIS A001190) was carried out by Otter [466] who 
established an estimate of type (75). The quantity $V_n$ is also the number of structurally distinct 
products of n elements under a commutative non-associative binary operation. ◁
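Extracting the coefficient of $z^n$ in $V(z) = z + \frac{1}{2}V(z)^2 + \frac{1}{2}V(z^2)$ gives, for n ≥ 2, $V_n = \frac{1}{2}\sum_{i=1}^{n-1} V_i V_{n-i} + \frac{1}{2}[\![ n \text{ even} ]\!]\,V_{n/2}$. A minimal sketch of this computation follows; the function name is illustrative.

```python
def unordered_binary_trees(n_max):
    """V_0..V_n_max from V(z) = z + V(z)^2 / 2 + V(z^2) / 2."""
    v = [0, 1]
    for n in range(2, n_max + 1):
        conv = sum(v[i] * v[n - i] for i in range(1, n))
        diag = v[n // 2] if n % 2 == 0 else 0  # coefficient of z^n in V(z^2)
        v.append((conv + diag) // 2)
    return v
```

The first values produced are 1, 1, 1, 2, 3, 6, 11, 23 for n = 1..8.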


▷ I.45. Hierarchies. Define the class $\mathcal{K}$ of hierarchies to be trees without nodes of outdegree 1 
and size determined by the number of external nodes. We have (Cayley 1857, see [67, p. 43])

\[ \mathcal{K} = \mathcal{Z} + \mathrm{MSET}_{\geq 2}(\mathcal{K}) \quad\Longrightarrow\quad K(z) = \frac{1}{2}z + \frac{1}{2}\left[ \exp\left( K(z) + \frac{1}{2}K(z^2) + \cdots \right) - 1 \right], \]

from which the first values are found (EIS A000669)

\[ K(z) = z + z^2 + 2z^3 + 5z^4 + 12z^5 + 33z^6 + 90z^7 + 261z^8 + 766z^9 + 2312z^{10} + \cdots. \]

These numbers also enumerate hierarchies in statistical classification theory [585]. They are the 
non-planar analogues of the Hipparchus–Schröder numbers on p. 69. ◁


▷ I.46. Non-plane series–parallel networks. Consider the class $\mathcal{SP}$ of series–parallel networks 
as previously considered in relation to the Hipparchus example, p. 69, but ignoring planar 
embeddings: all parallel arrangements of the (serial) networks $s_1, \ldots, s_k$ are considered equivalent, 
while the linear arrangement in each serial network matters. For instance, for n = 2, 3, there are 
respectively two and five such networks, so that $SP_2 = 2$ and $SP_3 = 5$. This is modelled by the grammar:

\[ \mathcal{S} = \mathcal{Z} + \mathrm{SEQ}_{\geq 2}(\mathcal{P}), \qquad \mathcal{P} = \mathcal{Z} + \mathrm{MSET}_{\geq 2}(\mathcal{S}), \]

and, to avoid counting networks of one element twice,

\[ SP(z) = S(z) + P(z) - z = z + 2z^2 + 5z^3 + 15z^4 + 48z^5 + 167z^6 + 602z^7 + 2256z^8 + \cdots \]

(EIS A003430). These objects are usually described as networks of electric resistors. ◁

I.5.3. Related constructions. Trees underlie recursive structures of all sorts. A 
first illustration is provided by the fact that the Catalan numbers, $C_n = \frac{1}{n+1}\binom{2n}{n}$, count 
general trees ($\mathcal{G}$) of size n + 1, binary trees ($\mathcal{B}$) of size n (if size is defined as the 
number of internal nodes), as well as triangulations ($\mathcal{T}$) comprised of n triangles. 
The combinatorialist John Riordan even coined the name Catalan domain for the area 
within combinatorics that deals with objects enumerated by Catalan numbers, and 
Stanley’s book contains an exercise [554, Ex. 6.19] whose statement alone spans ten 
full pages, with a list of 66 types of object(!) belonging to the Catalan domain. We 
shall illustrate the importance of Catalan numbers by describing a few fundamental 
correspondences (combinatorial isomorphisms, bijections) that explain the occurrence 
of Catalan numbers in several areas of combinatorics. 


Rotation of trees. The combinatorial isomorphism relating G and B (albeit with 
a shift in size) coincides with a classical technique of computer science [377, §2.3.2]. 
To wit, a general tree can be represented in such a way that every node has two types 
of links, one pointing to the left-most child, the other to the next sibling in left-to-right 
order. Under this representation, if the root of the general tree is put aside, then every 
node is linked to two other (possibly empty) subtrees. In other words, general trees 
with n nodes are equinumerous with pruned binary trees with n − 1 nodes:

\[ \mathcal{G}_n \cong \widetilde{\mathcal{B}}_{n-1}. \]


Graphically, this is illustrated as follows: 


The right-most tree is a binary tree drawn in a conventional manner, following a 45° 
tilt. This justifies the name of “rotation correspondence” often given to this transfor- 
mation. 
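The rotation correspondence can be sketched in a few lines. The representation chosen here is an assumption made for illustration: a general plane tree is the (possibly empty) list of its subtrees, and a pruned binary tree is either `None` or a pair `(left, right)`. The left link encodes “left-most child”, the right link encodes “next sibling”.

```python
def forest_to_binary(forest):
    """Left-child/next-sibling encoding of an ordered forest."""
    if not forest:
        return None
    first, rest = forest[0], forest[1:]
    # Left link: children of the first tree; right link: its next siblings.
    return (forest_to_binary(first), forest_to_binary(rest))


def binary_to_forest(node):
    """Inverse transformation, from a pruned binary tree back to a forest."""
    if node is None:
        return []
    left, right = node
    return [binary_to_forest(left)] + binary_to_forest(right)


def rotate(tree):
    """Map a general tree with n nodes to a pruned binary tree with n - 1 nodes."""
    # Setting the root aside leaves the forest of its subtrees.
    return forest_to_binary(tree)


def forests(m):
    """All ordered forests with m nodes in total."""
    if m == 0:
        return [[]]
    out = []
    for k in range(1, m + 1):  # k = number of nodes of the first tree
        for first in forests(k - 1):
            for rest in forests(m - k):
                out.append([first] + rest)
    return out


def general_trees(n):
    """All general plane trees with n nodes."""
    return forests(n - 1)
```

Applying `rotate` to every element of `general_trees(n)` produces pairwise distinct binary trees with n − 1 nodes, and `binary_to_forest` recovers the original tree, which is the bijection asserted in the text.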


Tree decomposition of triangulations. The relation between binary trees B and 
triangulations T is equally simple: draw a triangulation; define the root triangle as
the one that contains the edge connecting two designated vertices (for instance, the 
vertices numbered 0 and 1); associate to the root triangle the root of a binary tree; 
next, associate recursively to the subtriangulation on the left of the root triangle a left 
subtree; do similarly for the right subtriangulation giving rise to a right subtree. 
Under this correspondence, tree nodes correspond to triangle faces, while edges connect
adjacent triangles. What this correspondence proves is the combinatorial isomorphism

\[ \mathcal{T}_n \cong \mathcal{B}_n. \]


We turn next to another type of objects that are in correspondence with trees. 
These can be interpreted as words encoding tree traversals and, geometrically, as paths 
in the discrete plane $\mathbb{Z} \times \mathbb{Z}$.


Tree codes and Łukasiewicz words. Any plane tree can be traversed starting from
the root, proceeding depth-first and left-to-right, and backtracking upwards once a 
subtree has been completely traversed. For instance, in the tree 


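(77)    [Figure: plane tree with root a; a has children b and c; b has children d, e, f; d has child h; c has child g; g has children i and j.]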


the first visits to nodes take place in the following order 
a, b, d, h, e, f, c, g, i, j. 


(Note: the tags a, b, ..., added for convenience in order to distinguish between nodes, 
have no special meaning; only the abstract tree shape matters here.) This order is 
known as preorder or prefix order since a node is preferentially visited before its 
children. 

Given a tree, the listing of the outdegrees of nodes in prefix order is called the 
preorder degree sequence. For the tree of (77), this is 


\[ \sigma = (2, 3, 1, 0, 0, 0, 1, 2, 0, 0). \]


It is a fact that the degree sequence determines the tree unambiguously. Indeed, given 
the degree sequence, the tree is reconstructed step by step, adding nodes one after the 
other at the left-most available place. For σ, the first steps are then

[Figure: the first stages of the reconstruction of the tree from σ.]

Next, if one represents degree j by a “symbol” $f_j$, then the degree sequence becomes
a word over the infinite alphabet $F = \{f_0, f_1, \ldots\}$, for instance,


\[ \sigma \simeq f_2 f_3 f_1 f_0 f_0 f_0 f_1 f_2 f_0 f_0. \]
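The two directions of this encoding can be sketched as follows. The nested-list representation (a tree is the list of its subtrees) and the function names are assumptions made for illustration; the variable `example` encodes the shape of the tree of (77) as read off from the visiting order and the degree sequence given in the text.

```python
def degree_sequence(tree):
    """Outdegrees of the nodes listed in preorder; a tree is the list of its subtrees."""
    seq = [len(tree)]
    for child in tree:
        seq.extend(degree_sequence(child))
    return seq


def rebuild(seq):
    """Reconstruct a tree from its preorder degree sequence.

    Each symbol read creates a node at the left-most available place and
    announces how many subtrees must be read before that node is complete.
    """
    symbols = iter(seq)

    def build():
        degree = next(symbols)
        return [build() for _ in range(degree)]

    tree = build()
    if next(symbols, None) is not None:
        raise ValueError("symbols remain after the tree is complete")
    return tree


# a(b(d(h), e, f), c(g(i, j))): the shape of the tree of (77)
example = [[[[]], [], []], [[[], []]]]
sigma = degree_sequence(example)
```

Here `sigma` is the sequence σ of the text, and `rebuild(sigma)` returns `example` again, which illustrates that the degree sequence determines the tree.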


This can be interpreted in the language of logic as a denotation for a functional term 
built out of symbols from F, where $f_j$ represents a function of degree (or “arity”)
j. The correspondence even becomes obvious if superfluous parentheses are added at
appropriate places to delimit scope:


\[ \sigma \simeq f_2(f_3(f_1(f_0), f_0, f_0), f_1(f_2(f_0, f_0))). \]


Such codes are known as Łukasiewicz codes¹⁴, in recognition of the work of the Polish
logician with that name. Jan Łukasiewicz (1878–1956) introduced them in order to
completely specify the syntax of terms in various logical calculi; they prove nowadays 
basic in the development of parsers and compilers in computer science. 

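Finally, a tree code can be rendered as a walk over the discrete lattice $\mathbb{Z} \times \mathbb{Z}$.
Associate to any $f_j$ (i.e., any node of outdegree j) the displacement $(1, j - 1) \in \mathbb{Z} \times \mathbb{Z}$,
and plot the sequence of moves starting from the origin. In our example we find:


\[ \begin{array}{cccccccccc}
f_2 & f_3 & f_1 & f_0 & f_0 & f_0 & f_1 & f_2 & f_0 & f_0 \\
1 & 2 & 0 & -1 & -1 & -1 & 0 & 1 & -1 & -1.
\end{array} \]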

There, the last line represents the vertical displacements. The resulting paths are
known as Łukasiewicz paths. Such a walk is then characterized by two conditions:
the vertical displacements are in the set {−1, 0, 1, 2, ...}; all its points, except for the
very last one, lie in the upper half-plane.

By this correspondence, the number of Łukasiewicz paths with n steps is the
shifted Catalan number, $\frac{1}{n}\binom{2n-2}{n-1}$.


¹⁴ A less dignified name is “Polish prefix notation”. The “reverse Polish notation” is a variant based
on postorder that has been used in some calculators since the 1970s.


▷ I.47. Conjugacy principle and cycle lemma. Let L be the class of all Łukasiewicz paths.
Define a “relaxed” path as one that starts at level 0, ends at level −1 but is otherwise allowed
to include arbitrary negative points; let M be the corresponding class. Then, each relaxed path
can be cut-and-pasted uniquely after its left-most minimum as described here:

[Figure: a relaxed path is cut just after its left-most minimum and the two pieces are exchanged.]

This associates to every relaxed path of length ν a unique standard path. A bit of combinatorial
reasoning shows that the correspondence is ν-to-1 (each element of L has exactly ν preimages).
One thus has $M_\nu = \nu L_\nu$. This correspondence preserves the number of steps of each type
($f_0, f_1, \ldots$), so that the number of Łukasiewicz paths with $\nu_j$ steps of type $f_j$ is

\[ \frac{1}{\nu} [u^{-1} x_0^{\nu_0} x_1^{\nu_1} \cdots] \left( x_0 u^{-1} + x_1 + x_2 u + x_3 u^2 + \cdots \right)^{\nu} = \frac{1}{\nu} \binom{\nu}{\nu_0, \nu_1, \ldots}, \]

under the necessary condition $(-1)\nu_0 + 0\nu_1 + 1\nu_2 + 2\nu_3 + \cdots = -1$. This combinatorial way
of obtaining refined Catalan statistics is known as the conjugacy principle [503] or the cycle
lemma [129, 155, 184]. It is logically equivalent to the Lagrange Inversion Theorem, as shown
by Raney [503]. Dvoretzky & Motzkin [184] have employed this technique to solve a number
of counting problems related to circular arrangements. ◁
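For small cases the multinomial formula of the note can be compared with an exhaustive enumeration. The sketch below is only a sanity check under stated assumptions: a path is given by its sequence of outdegrees (step j has vertical displacement j − 1), `counts[j]` is the number of steps of type $f_j$, and all function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def is_lukasiewicz(degrees):
    """True if the walk stays at level >= 0 until its last step, which ends at -1."""
    level = 0
    last = len(degrees) - 1
    for i, d in enumerate(degrees):
        level += d - 1
        if level < 0 and i < last:
            return False
    return level == -1


def count_by_enumeration(counts):
    """Brute force over all distinct arrangements of the steps (small inputs only)."""
    letters = [j for j, c in enumerate(counts) for _ in range(c)]
    return sum(is_lukasiewicz(w) for w in set(permutations(letters)))


def count_by_cycle_lemma(counts):
    """(1/nu) * multinomial(nu; nu_0, nu_1, ...), valid when sum (j-1) nu_j = -1."""
    nu = sum(counts)
    multinomial = factorial(nu)
    for c in counts:
        multinomial //= factorial(c)
    return multinomial // nu
```

For instance, `counts = [3, 0, 2]` describes binary trees with two internal nodes, and both functions return 2.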


Example I.16. Binary tree codes and Dyck paths. Walks associated with binary trees have
a very special form since the vertical displacements can only be +1 or −1. The paths resulting
from the Łukasiewicz correspondence are then equivalently characterized as sequences of
numbers $x = (x_0, x_1, \ldots, x_{2n}, x_{2n+1})$ satisfying the conditions

(78)    $x_0 = 0; \quad x_j \ge 0 \ \text{for } 1 \le j \le 2n; \quad |x_{j+1} - x_j| = 1; \quad x_{2n+1} = -1.$

These coincide with “gambler ruin sequences”, a familiar object from probability theory: a
player plays heads and tails. He starts with no capital ($x_0 = 0$) at time 0; his total gain is $x_j$ at
time j; he is allowed no credit ($x_j \ge 0$) and loses at the very end of the game ($x_{2n+1} = -1$); his
gains are ±1 depending on the outcome of the coin tosses ($|x_{j+1} - x_j| = 1$).

It is customary to drop the final step and consider “excursions” that take place in the upper
half-plane. The resulting objects, defined as sequences $(x_0 = 0, x_1, \ldots, x_{2n-1}, x_{2n} = 0)$
satisfying the first three conditions of (78), are known in combinatorics as Dyck paths¹⁵. By
construction, Dyck paths of length 2n correspond bijectively to binary trees with n internal
nodes and are consequently enumerated by Catalan numbers. Let D be the combinatorial class
of Dyck paths, with size defined as length. This property can also be checked directly: the
quadratic decomposition

[Figure: a non-empty Dyck path cut at its first return to level 0 into an ascent, a Dyck path, a descent, and a further Dyck path.]

(79)    $\mathcal{D} = \{\epsilon\} + (\nearrow \mathcal{D} \searrow) \times \mathcal{D} \quad \Longrightarrow \quad D(z) = 1 + (z D(z) z)\, D(z).$

From this OGF, the Catalan numbers are found (as expected): $D_{2n} = \frac{1}{n+1}\binom{2n}{n}$. The decomposition
(79) is known as the “first passage” decomposition as it is based on the first time the
accumulated gain in the coin-tossing game passes through the value zero.
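The equation $D(z) = 1 + z^2 D(z)^2$ determines the coefficients one at a time, which gives a quick numerical check of the Catalan form. The sketch below is illustrative; the function name is hypothetical and size is taken to be path length, as in the text.

```python
from math import comb


def dyck_counts(max_len):
    """d[k] = number of Dyck paths of length k, read off D(z) = 1 + z^2 D(z)^2."""
    d = [0] * (max_len + 1)
    d[0] = 1                                   # the empty path
    for k in range(2, max_len + 1):
        # [z^k] z^2 D(z)^2 = sum over i + j = k - 2 of d[i] * d[j]
        d[k] = sum(d[i] * d[k - 2 - i] for i in range(k - 1))
    return d


d = dyck_counts(20)
catalan = [comb(2 * n, n) // (n + 1) for n in range(11)]
```

The even-indexed entries of `d` agree with `catalan`, and the odd-indexed entries vanish since a Dyck path has even length.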

Dyck paths also arise in connection with well-parenthesized expressions. These are recognized
by keeping a counter that records at each stage the excess of the number of opening
brackets “(” over closing brackets “)”. Finally, one of the origins of the Dyck path is the famous
ballot problem, which goes back to the nineteenth century [423]: there are two candidates A
and B that stand for election, 2n voters, and the election eventually results in a tie; what is the
probability that A is always ahead of or tied with B when the ballots are counted? The answer is

\[ \frac{D_{2n}}{\binom{2n}{n}} = \frac{1}{n+1}, \]

since there are $\binom{2n}{n}$ possibilities in total, of which the number of favourable cases is $D_{2n}$, a Catalan
number. The central rôle of Dyck paths and Catalan numbers in problems coming from such
diverse areas is quite remarkable. Section V.4, p. 318 presents refined counting results regarding
lattice paths (e.g., the analysis of height) and Subsection VII.8.1, p. 506 introduces exact and asymptotic
results in the harder case of an arbitrary finite collection of step types (not just ±1). □


¹⁵ Dyck paths are closely associated with free groups on one generator and are named after the German
mathematician Walther (von) Dyck (1856-1934) who introduced free groups around 1880. 
▷ I.48. Dyck paths, parenthesis systems, and general trees. The class of Dyck paths admits an
alternative sequence decomposition

[Figure: a Dyck path cut at each of its returns to level 0 into a sequence of arches.]

(80)    $\mathcal{D} = \text{SEQ}(\mathcal{Z} \times \mathcal{D} \times \mathcal{Z}),$

which again leads to the Catalan GF. The decomposition (80) is known as the “arch decomposition”
(see Subsection V.4.1, p. 319, for more). It can also be directly related to traversal
sequences of general trees, but with the directions of edge traversals being recorded (instead of
traversals based on node degrees): for a general tree τ, define its encoding κ(τ) over the binary
alphabet {↗, ↘} recursively by the rules:

\[ \kappa(\bullet) = \epsilon, \qquad \kappa(\bullet(\tau_1, \ldots, \tau_r)) = \nearrow \kappa(\tau_1) \searrow \cdots \nearrow \kappa(\tau_r) \searrow. \]

This is the classical representation of trees by a parenthesis system (interpret “↗” and “↘” as
“(” and “)”, respectively), which associates to a tree of n nodes a path of length 2n − 2. ◁


▷ I.49. Random generation of Dyck paths. Dyck paths of length 2n can be generated uniformly
at random in time linear in n. (Hint: By Note I.47, it suffices to generate uniformly a sequence
of n a’s and (n + 1) b’s, then reorganize it according to the conjugacy principle.) ◁
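The hint can be turned into a short generator. This is a minimal sketch under stated assumptions: the letters a and b are represented as the steps +1 and −1, the shuffle supplied by Python's `random` module is taken as the source of uniformity, and the function name is hypothetical.

```python
import random


def random_dyck_path(n, rng=random):
    """Random Dyck path of length 2n, returned as a list of +1 / -1 steps."""
    steps = [1] * n + [-1] * (n + 1)
    rng.shuffle(steps)                    # uniform arrangement of the n + (n+1) letters

    # Locate the left-most minimum of the partial sums (it is at most -1).
    level, lowest, cut = 0, 0, 0
    for i, s in enumerate(steps, start=1):
        level += s
        if level < lowest:
            lowest, cut = level, i

    # Conjugate: restart the word just after the left-most minimum. The result
    # stays at level >= 0 until its final step, which brings it down to -1.
    rotated = steps[cut:] + steps[:cut]
    return rotated[:-1]                   # dropping the last step leaves a Dyck path
```

Each standard path of length 2n + 1 is the image of exactly 2n + 1 relaxed words, so the uniform distribution on arrangements is carried to the uniform distribution on Dyck paths; the cost is one shuffle and one linear scan.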


▷ I.50. Excursions, bridges, and meanders. Adapting a terminology from probability theory,
one sets the following definitions: (i) a meander (M) is a word over {−1, +1}, such that the
sum of the values of any of its prefixes is always a non-negative integer; (ii) a bridge (B) is a
word whose values of letters sum to 0. Thus a meander represents a walk that wanders in the
first quadrant; a bridge, regarded as a walk, may wander above and below the horizontal line,
but its final altitude is constrained to be 0; an excursion is both a meander and a bridge. Simple
decompositions provide

\[ M(z) = \frac{D(z)}{1 - zD(z)}, \qquad B(z) = \frac{1}{1 - 2z^2 D(z)}, \]

implying $M_n = \binom{n}{\lfloor n/2 \rfloor}$ [EIS A001405] and $B_{2n} = \binom{2n}{n}$ [EIS A000984]. ◁
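The closed forms of the note are easy to confirm by exhaustive enumeration for small lengths. The following sketch simply classifies all $2^n$ words over {−1, +1}; the function name is hypothetical.

```python
from itertools import product
from math import comb


def classify(n):
    """Numbers of meanders, bridges and excursions among the words of length n."""
    meanders = bridges = excursions = 0
    for word in product((-1, 1), repeat=n):
        level, nonneg = 0, True
        for step in word:
            level += step
            if level < 0:
                nonneg = False           # some prefix sum is negative
        meanders += nonneg
        bridges += (level == 0)
        excursions += (nonneg and level == 0)
    return meanders, bridges, excursions
```

For n = 4 this gives 6 meanders, 6 bridges and 2 excursions, in agreement with $\binom{4}{2}$, $\binom{4}{2}$ and the Catalan number $C_2$.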


▷ I.51. Motzkin paths and unary–binary trees. Motzkin paths are defined by changing the
third condition of (78) defining Dyck paths into $|x_{j+1} - x_j| \le 1$. They appear as codes for
unary–binary trees and are enumerated by the Motzkin numbers of Note I.39, p. 68. ◁


Example I.17. The complexity of boolean functions. Complexity theory provides many
surprising applications of enumerative combinatorics and asymptotic estimates. In general,
one starts with a finite set of abstract mathematical objects Ω and a combinatorial class D
of concrete descriptions. By assumption, to every element δ ∈ D is associated an object
μ(δ) ∈ Ω, its “meaning”; conversely any object of Ω admits at least one description in D
(that is, the function μ is surjective). It is then of interest to quantify properties of the shortest
description function defined for ω ∈ Ω as

\[ \sigma(\omega) := \min \{\, |\delta|_{\mathcal{D}} \;:\; \mu(\delta) = \omega \,\}, \]

and called the complexity of the element ω ∈ Ω (with respect to D).

We take here Ω to be the class of all boolean functions on m variables. Their number is
$\|\Omega\| = 2^{2^m}$. As descriptions, we adopt the class of logical expressions involving the logical
connectives ∨, ∧ and pure or negated variables. Equivalently, D is the class of binary trees,
where internal nodes are tagged by a logical disjunction (“∨”) or a conjunction (“∧”), and each
external node is tagged by either a boolean variable of $\{x_1, \ldots, x_m\}$ or a negated variable of
$\{\neg x_1, \ldots, \neg x_m\}$. Define the size of a tree description as the number of internal nodes; that is,
the number of logical operators. Then, one has

(81)    $D_n = \frac{1}{n+1}\binom{2n}{n} \cdot 2^n \cdot (2m)^{n+1},$


as seen by counting tree shapes and possibilities for internal as well as external node tags. 
The crux of the matter is that if the inequality

(82)    $\sum_{j=0}^{\nu} D_j < \|\Omega\|$

holds, then there are not enough descriptions of size ≤ ν to exhaust Ω. (This is analogous to the
coding argument of Note I.23, p. 53.) In other terms, there must exist at least one object in Ω
whose complexity exceeds ν. If the left side of (82) is much smaller than the right side, then it
must even be the case that “most” Ω-objects have a complexity that exceeds ν.

In the case of boolean functions and tree descriptions, the asymptotic form (33) is available.
From (81) it can be seen that, for n, ν getting large, one has

\[ D_n = O(16^n m^n n^{-3/2}), \qquad \sum_{j=0}^{\nu} D_j = O(16^{\nu} m^{\nu} \nu^{-3/2}). \]

Choose ν such that the second expression is $o(\|\Omega\|)$, which is ensured for instance by taking for
ν the value

\[ \nu(m) := \frac{2^m}{4 + \log_2 m}. \]

With this choice, one has the following suggestive statement:


A fraction tending to 1 (as m → ∞) of boolean functions in m variables have tree complexity
at least $2^m / (4 + \log_2 m)$.
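The counting argument can be tried out numerically with exact integers. In the sketch below, rounding ν(m) down to an integer is an assumption made so that the sum in (82) has an integer upper limit, and the function names are hypothetical; the description count is the one given by (81).

```python
from math import comb, floor, log2


def descriptions(n, m):
    """D_n of (81): Catalan tree shapes, 2^n operator tags, (2m)^(n+1) leaf tags."""
    return comb(2 * n, n) // (n + 1) * 2**n * (2 * m) ** (n + 1)


def threshold(m):
    """nu(m) = 2^m / (4 + log2 m), rounded down (an assumption of this sketch)."""
    return floor(2**m / (4 + log2(m)))


def fraction_describable(m):
    """Upper bound on the fraction of functions with tree complexity <= nu(m)."""
    nu = threshold(m)
    total = sum(descriptions(j, m) for j in range(nu + 1))
    return total / 2 ** (2**m)
```

For m = 8 the threshold is 36 and the bound is already below one per cent, so nearly all of the $2^{256}$ functions need more than 36 connectives.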


Regarding upper bounds on boolean function complexity, a function always has a tree
complexity that is at most $2^{m+1} - 3$. To see this, note that for m = 1, the four functions are

\[ 0 \equiv (x_1 \wedge \neg x_1), \qquad 1 \equiv (x_1 \vee \neg x_1), \qquad x_1, \qquad \neg x_1. \]

Next, a function of m variables is representable by a technique known as the binary decision
tree (BDT),

\[ f(x_1, \ldots, x_{m-1}, x_m) = \bigl(\neg x_m \wedge f(x_1, \ldots, x_{m-1}, 0)\bigr) \vee \bigl(x_m \wedge f(x_1, \ldots, x_{m-1}, 1)\bigr), \]

which provides the basis of the induction as it reduces the representation of an m-ary function
to the representation of two (m − 1)-ary functions, consuming on the way three logical
connectives. The worst-case size c(m) thus satisfies c(m) ≤ 2c(m − 1) + 3 with c(1) = 1, which
gives the bound $2^{m+1} - 3$.
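The inductive construction translates directly into a recursive procedure. The encoding below is an assumption made for illustration: a function of m variables is given by its truth table of length $2^m$, indexed by $x_1 + 2x_2 + \cdots + 2^{m-1}x_m$, and expressions are nested tuples; all names are hypothetical.

```python
def bdt(table):
    """Expression computing the function with truth table `table` (length 2^m, m >= 1).

    Expressions are ('var', i), ('not', i), ('and', l, r) or ('or', l, r).
    """
    m = len(table).bit_length() - 1
    if m == 1:
        f0, f1 = table
        if f0 and f1:
            return ('or', ('var', 1), ('not', 1))     # the constant 1
        if not f0 and not f1:
            return ('and', ('var', 1), ('not', 1))    # the constant 0
        return ('var', 1) if f1 else ('not', 1)
    half = len(table) // 2
    low = bdt(table[:half])       # restriction to x_m = 0
    high = bdt(table[half:])      # restriction to x_m = 1
    return ('or', ('and', ('not', m), low), ('and', ('var', m), high))


def size(expr):
    """Number of logical connectives (internal nodes)."""
    if expr[0] in ('var', 'not'):
        return 0
    return 1 + size(expr[1]) + size(expr[2])


def evaluate(expr, x):
    """Value of the expression at x = (x_1, ..., x_m), a tuple of booleans."""
    if expr[0] == 'var':
        return x[expr[1] - 1]
    if expr[0] == 'not':
        return not x[expr[1] - 1]
    left, right = evaluate(expr[1], x), evaluate(expr[2], x)
    return (left and right) if expr[0] == 'and' else (left or right)
```

Each level of the recursion adds three connectives to the two recursive descriptions, so `size(bdt(table))` never exceeds $2^{m+1} - 3$.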

Altogether, basic counting arguments have shown that “most” boolean functions have a
tree-complexity ($2^m / \log m$) that is fairly close to the maximum possible, namely, $O(2^m)$. A
similar result has been established by Shannon for the measure called circuit complexity: circuits
are more powerful than trees, but Shannon’s result states that almost all boolean functions
of m variables have circuit complexity $O(2^m / m)$. See the chapter by Li and Vitányi in [591]
and Gardy’s survey [283] on random boolean expressions for a discussion of such counting
techniques within the framework of complexity theory and logic. We resume this thread in Example
VII.17, p. 487, where we quantify the probability that a large random boolean expression
computes a fixed function. □
I.5.4. Context-free specifications and languages. Many of the combinatorial
examples encountered so far in this section can be organized into a common framework,
which is fundamental in formal linguistics and theoretical computer science.


Definition I.13. A class C is said to be context-free if it coincides with the first component
($\mathcal{T} = \mathcal{S}_1$) of a system of equations

(83)    $\begin{cases} \mathcal{S}_1 = \mathfrak{F}_1(\mathcal{Z}, \mathcal{S}_1, \ldots, \mathcal{S}_r) \\ \quad\vdots \\ \mathcal{S}_r = \mathfrak{F}_r(\mathcal{Z}, \mathcal{S}_1, \ldots, \mathcal{S}_r), \end{cases}$

where each $\mathfrak{F}_j$ is a constructor that only involves the operations of combinatorial sum
(+) and cartesian product (×), as well as the neutral class, $\mathcal{E} = \{\epsilon\}$.

A language L is said to be an unambiguous context-free language if it is combinatorially
isomorphic to a context-free class of trees: $\mathcal{C} \cong \mathcal{T}$.


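The classes of general trees (G) and binary trees (B) are context-free, since they
are specifiable as

\[ \begin{cases} \mathcal{G} = \mathcal{Z} \times \mathcal{F} \\ \mathcal{F} = \{\epsilon\} + (\mathcal{G} \times \mathcal{F}), \end{cases} \qquad \mathcal{B} = \mathcal{Z} + (\mathcal{B} \times \mathcal{B}); \]

here F designates ordered forests of general trees. Context-free specifications may
be used to describe all sorts of combinatorial objects. For instance, the class $\mathcal{U} =
\mathcal{T} \setminus \mathcal{T}_0$ of non-empty triangulations of convex polygons (Note 10, p. 36) is specified
symbolically by

(84)    $\mathcal{U} = \nabla + (\nabla \times \mathcal{U}) + (\mathcal{U} \times \nabla) + (\mathcal{U} \times \nabla \times \mathcal{U}),$

where $\nabla \cong \mathcal{Z}$ represents a generic triangle. The Łukasiewicz language and the set of
Dyck paths are context-free classes since they are bijectively equivalent to G and U.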
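Because sums and products of classes translate into sums and products of counting series, the counting sequences of such specifications can be obtained by plain iteration on truncated series. The sketch below is only an illustration: series are lists of coefficients up to an arbitrary order `N`, and `mul`, `add` and `solve` are hypothetical helpers, not part of any library.

```python
from math import comb

N = 10                                   # keep the coefficients of z^0 .. z^N


def mul(a, b):
    """Product of two truncated series (translation of the cartesian product)."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b[: N + 1 - i]):
            c[i + j] += ai * bj
    return c


def add(*series):
    """Sum of truncated series (translation of the combinatorial sum)."""
    return [sum(terms) for terms in zip(*series)]


Z = [0, 1] + [0] * (N - 1)               # the atom
ONE = [1] + [0] * N                      # the neutral class


def solve(step, unknowns):
    """Iterate the system from empty classes until the truncations are stable."""
    state = [[0] * (N + 1) for _ in range(unknowns)]
    while True:
        new = list(step(*state))
        if new == state:
            return state
        state = new


# G = Z x F, F = {eps} + G x F
G, F = solve(lambda g, f: (mul(Z, f), add(ONE, mul(g, f))), 2)
# B = Z + B x B
(B,) = solve(lambda b: (add(Z, mul(b, b)),), 1)
# U = V + V x U + U x V + U x V x U, with the triangle V of size 1
(U,) = solve(lambda u: (add(Z, mul(Z, u), mul(u, Z), mul(mul(u, Z), u)),), 1)

catalan = [comb(2 * n, n) // (n + 1) for n in range(N + 1)]
```

With these conventions `F` reproduces the Catalan numbers, `G` and `B` reproduce them with a shift of one in the index, and `U` agrees with them from size 1 onwards.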

The term “context-free” comes from linguistics: it stresses the fact that objects
can be “freely” generated by the rules of (83), this without any constraints imposed
by an outside context¹⁶. There, one classically defines a context-free language as
the language formed with words that are obtained as sequences of leaf tags (read in
left-to-right order) of a context-free variety of trees. In formal linguistics, the one-to-one
mapping between trees and words is not generally imposed; when it is satisfied,
the context-free language is said to be unambiguous; in such cases, words and trees
determine each other uniquely, cf. Note I.54 below.

An immediate consequence of the admissibility theorems is the following proposition,
first encountered by Chomsky and Schützenberger [119] in the course of their
research relating formal languages and formal power series.


¹⁶ Formal language theory also defines context-sensitive grammars where each rule (called a production)
is applied only if it is enabled by some external context. Context-sensitive grammars have greater
expressive power than context-free ones, but they depart significantly from decomposability and are
surrounded by strong undecidability properties. Accordingly, context-sensitive grammars cannot be associated
with any global generating function formalism.




[Figure I.16 image not reproduced.]

Figure I.16. A directed animal, its tilted version (after a +π/4 rotation), and three
of its equivalent representations as a heap of dimers.


Proposition I.7. A combinatorial class C that is context-free admits an OGF that is 
an algebraic function. In other words, there exists a (non-null) bivariate polynomial 
$P(z, y) \in \mathbb{C}[z, y]$ such that


P(z, C(z)) = 0. 


Proof. By the basic sum and product rules, the context-free system (83) translates into
a system of OGF equations,

\[ \begin{cases} S_1(z) = \Phi_1(z, S_1(z), \ldots, S_r(z)) \\ \quad\vdots \\ S_r(z) = \Phi_r(z, S_1(z), \ldots, S_r(z)), \end{cases} \]

where the $\Phi_j$ are the polynomials translating the constructions $\mathfrak{F}_j$.

It is then well known that algebraic elimination is possible in polynomial systems.
Here, it is possible to eliminate the auxiliary variables $S_2, \ldots, S_r$, one by one,
preserving the polynomial character of the system at each stage. The end result is
then a single polynomial equation satisfied by $C(z) \equiv S_1(z)$. (Methods for effectively
performing polynomial elimination include a repeated use of resultants as well
as Gröbner basis algorithms; see Appendix B.1: Algebraic elimination, p. 739 for a
brief discussion and references.) ■
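As a small worked instance of the elimination step, take the specification of general trees given earlier in this subsection, with G and F denoting the OGFs of trees and forests. The derivation below is a sketch added for illustration; it eliminates the auxiliary unknown F by direct substitution rather than by resultants.

```latex
\begin{align*}
G(z) = z\,F(z), \quad F(z) = 1 + G(z)\,F(z)
  &\implies F(z) = \frac{G(z)}{z} \\
  &\implies \frac{G(z)}{z} = 1 + \frac{G(z)^2}{z} \\
  &\implies G(z)^2 - G(z) + z = 0 .
\end{align*}
```

The polynomial of the proposition can thus be taken to be $P(z, y) = y^2 - y + z$, and the root that vanishes at z = 0 is $G(z) = \frac{1}{2}\bigl(1 - \sqrt{1 - 4z}\bigr)$, the shifted Catalan generating function.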


Proposition I.7 is a counterpart of Proposition I.3 (p. 57) according to which ratio- 
nal generating functions arise from finite state devices, and it justifies the importance 
of algebraic functions in enumeration theory. We shall encounter applications of such 
algebraic generating functions to planar non-crossing configurations (p. 485), walks
(p. 506) and planar maps (p. 513), when we develop a general asymptotic theory of 
their coefficients in Chapter VII, based on singularity theory. The example below 
shows the way certain lattice configurations can be modelled by a context-free speci- 
fication. 


Example I.18. Directed animals. Consider the square lattice $\mathbb{Z}^2$. A directed animal with a
compact source of size k is a finite set of points α of the lattice such that: (i) for 0 ≤ i < k, the
points (−i, i), called source points, belong to α; (ii) all other points in α can be reached from
one of the source points by a path made of North and East steps and having all its vertices in α.
(The animal in Figure I.16 has one source.) Such lattice configurations have been introduced
by statistical physicists Dhar et al. [162], since they provide a tractable model of 2-dimensional
percolation. Our discussion follows Bousquet-Mélou’s insightful presentation in [84], itself
based on Viennot’s elegant theory of heaps of pieces [597].

The best way to visualize an animal is as follows (Figure I.16): rotate the lattice by +π/4
and associate to each vertex of the animal a horizontal piece, also called a dimer. The length of 
a piece is taken to be slightly less than the diagonal of a mesh of the original lattice. Pieces are 
allowed to slide vertically (up or down) in their column, but not to jump over each other. One 
can then think of an animal as being a heap of pieces, where pieces take their places naturally, 
under the effect of gravity, and each one stops as soon as it is blocked by a piece immediately 
below. (The heap associated to an animal satisfies the additional property that no two pieces in 
a column can be immediately adjacent to one another.) 

Define a pyramid to be a one-source animal and a half-pyramid to be a pyramid that has 
no vertex strictly to the left of its source point, in the tilted representation. Let P and H be 
respectively the class of pyramids and half-pyramids, viewed as heaps. By a corner decomposi- 
tion (Note I.52), pyramids and half-pyramids can be constructed as suggested by the following 
diagram: 


(85)    [Diagram: corner decompositions of pyramids and of half-pyramids.]


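The pictorial description (85) is equivalent to a context-free specification:

\[ \begin{cases} \mathcal{P} = \mathcal{H} + \mathcal{P} \times \mathcal{H} \\ \mathcal{H} = \mathcal{Z} + \mathcal{Z} \times \mathcal{H} + \mathcal{Z} \times \mathcal{H} \times \mathcal{H} \end{cases} \quad \Longrightarrow \quad \begin{cases} P = H + PH \\ H = z + zH + zH^2, \end{cases} \]

in which the second equation, a quadratic, is readily solved to provide H, which in turn gives
P, by the first equation. One finds:

(86)    $\begin{cases} P(z) = \dfrac{1}{2}\left( \sqrt{\dfrac{1+z}{1-3z}} - 1 \right) = z + 2z^2 + 5z^3 + 13z^4 + 35z^5 + \cdots \\[2ex] H(z) = \dfrac{1 - z - \sqrt{(1+z)(1-3z)}}{2z} = z + z^2 + 2z^3 + 4z^4 + 9z^5 + \cdots, \end{cases}$

corresponding respectively to EIS A005773 and EIS A001006 (Motzkin numbers, cf. Notes I.39,
p. 68 and I.51, p. 77). See Example VI.3 and Note VI.11, p. 396, for relevant asymptotics.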

Similar constructions permit us to decompose compact-source directed animals, whose
class we denote by A. For instance:

[Diagram: decomposition of a compact-source animal.]

Compact-source animals with k sources are then specified by $\mathcal{P} \times \text{SEQ}_{k-1}(\mathcal{H})$, and we have

(87)    $\mathcal{A} \cong \mathcal{P} \times \text{SEQ}(\mathcal{H}) \quad \Longrightarrow \quad A(z) = \frac{P(z)}{1 - H(z)} = \frac{z}{1 - 3z},$

where the last form results from basic algebraic simplifications. A consequence of (87) is the
surprisingly simple (but non-trivial) result that there are $3^{n-1}$ compact-source animals of size n.
The papers [61, 87] develop further aspects of the rich counting theory of animals. □
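The series in this example can be recomputed by iterating the defining equations on truncated power series. The sketch assumes the equations H = z + zH + zH², P = H + PH and A = P · SEQ(H) as read from the specification and from (87); the helpers `mul`, `add` and `fixpoint` are hypothetical and the truncation order `N` is arbitrary.

```python
N = 12                                   # keep the coefficients of z^0 .. z^N


def mul(a, b):
    """Product of two truncated series."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b[: N + 1 - i]):
            c[i + j] += ai * bj
    return c


def add(*series):
    """Sum of truncated series."""
    return [sum(terms) for terms in zip(*series)]


def fixpoint(step):
    """Iterate s -> step(s) from the zero series until the truncation is stable."""
    s = [0] * (N + 1)
    while True:
        t = step(s)
        if t == s:
            return s
        s = t


Z = [0, 1] + [0] * (N - 1)
ONE = [1] + [0] * N

H = fixpoint(lambda h: add(Z, mul(Z, h), mul(Z, mul(h, h))))   # half-pyramids
P = fixpoint(lambda p: add(H, mul(p, H)))                      # pyramids
SEQ_H = fixpoint(lambda s: add(ONE, mul(H, s)))                # 1 / (1 - H)
A = mul(P, SEQ_H)                                              # compact-source animals
```

The lists `H` and `P` start with the coefficients displayed in (86), and `A[n]` equals $3^{n-1}$ for every n from 1 to `N`.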


▷ I.52. Understanding animals. In the first equation of (85), a pyramid π that is not a half-pyramid
has a unique dimer which is of lowest altitude and immediately to the left of the source.
Take that dimer and push it upwards, in the direction of imaginary infinity; it will then carry with
it a group of dimers that constitute, by construction, a pyramid ω. What remains has no dimer
to the left of its source, and hence forms a half-pyramid χ. The following diagram illustrates
the decomposition, with the dimers of ω equipped with an upward-pointing arrow:

[Diagram: a pyramid π split into a pyramid ω, whose dimers carry upward arrows, and a half-pyramid χ.]

Conversely, given a pair (ω, χ) ∈ P × H, attach first χ to the base; then, let ω fall down from
imaginary infinity. The dimers of ω will take their place above the dimers of χ, blocked in
various manners on their way down, the whole set eventually forming a pyramid. A moment
of reflection convinces one that the original pyramid π is recovered in this way; that is, the
transformation π ↦ (ω, χ) is bijective. ◁


▷ I.53. “Tree-like” structures. A context-free specification can always be regarded as defining
a class of trees. Indeed, if the jth term in the construction $\mathfrak{F}_i$ of (83) is “coloured” with the
pair (i, j), it is seen that a context-free system yields a class of trees whose nodes are tagged by
pairs (i, j) in a way consistent with the system’s rules. However, despite this correspondence,
it is often convenient to preserve the possibility of operating directly with objects when the tree
aspect may be unnatural. (Some authors have developed a parallel notion of “object grammars”;
see for instance [183], itself inspired by techniques of polyomino surgery in [150].) By a terminology
borrowed from the theory of syntax analysis in computer science, such trees are referred
to as “parse trees” or “syntax trees”. ◁


▷ I.54. Context-free languages. Let A be a fixed finite alphabet whose elements are called
letters. A grammar G is a collection of equations

(88)    $G : \begin{cases} \mathcal{L}_1 = \mathfrak{F}_1(\mathbf{a}, \mathcal{L}_1, \ldots, \mathcal{L}_m) \\ \quad\vdots \\ \mathcal{L}_m = \mathfrak{F}_m(\mathbf{a}, \mathcal{L}_1, \ldots, \mathcal{L}_m), \end{cases}$

where each $\mathfrak{F}_j$ involves only the operations of union (∪) and concatenation product (·), with $\mathbf{a}$
the vector of letters in A. For instance,

\[ \mathfrak{F}_1(\mathbf{a}, \mathcal{L}_1, \mathcal{L}_2, \mathcal{L}_3) = a_2 \cdot \mathcal{L}_2 \cdot \mathcal{L}_3 \,\cup\, a_3 \,\cup\, \mathcal{L}_3 \cdot a_2 \cdot \mathcal{L}_1. \]

A solution to (88) is an m-tuple of languages over the alphabet A that satisfies the system. By
convention, one declares that the grammar G defines the first component, $\mathcal{L}_1$.

To each grammar (88), one can associate a context-free specification (60) by transforming
unions into disjoint unions, “∪ ↦ +”, and catenation into cartesian products, “· ↦ ×”. Let
$\widehat{G}$ be the specification associated in this way to the grammar G. The objects described by $\widehat{G}$
appear in this perspective to be trees (see the discussion above regarding parse trees). Let h
be the transformation from trees of $\widehat{G}$ to languages of G that lists letters in infix (i.e., left-to-right)
order: we call such an h the erasing transformation since it “forgets” all the structural
information contained in the parse tree and only preserves the succession of letters. Clearly,
application of h to the combinatorial specifications determined by $\widehat{G}$ yields languages that obey
the grammar G. For a grammar G and a word $w \in A^{*}$, the number of parse trees $t \in \widehat{G}$ such
that h(t) = w is called the ambiguity coefficient of w with respect to the grammar G.

A grammar G is unambiguous if all the corresponding ambiguity coefficients are either 0
or 1. This means that there is a bijection between parse trees of $\widehat{G}$ and words of the language
described by G: each word generated is uniquely “parsable” according to the grammar. One has,
from Proposition I.7: The OGF of an unambiguous context-free language satisfies a polynomial
system of the form (61), and is consequently an algebraic function. ◁
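Ambiguity coefficients can be computed mechanically for a concrete grammar. The grammar used below, E = a ∪ E · + · E over the alphabet {a, +}, is a hypothetical example chosen for illustration (it does not appear in the text); it is ambiguous because a chain of additions can be bracketed in several ways.

```python
from functools import lru_cache


def ambiguity(word):
    """Number of parse trees of `word` for the grammar  E = a  U  E . '+' . E."""

    @lru_cache(maxsize=None)
    def trees(i, j):
        # parse trees that derive the factor word[i:j] from E
        count = 1 if word[i:j] == "a" else 0
        for k in range(i + 1, j - 1):
            if word[k] == "+":
                # split at this '+': left operand word[i:k], right operand word[k+1:j]
                count += trees(i, k) * trees(k + 1, j)
        return count

    return trees(0, len(word))
```

The word `a+a+a` has ambiguity coefficient 2 and `a+a+a+a` has coefficient 5, so this grammar is not unambiguous. Replacing the second rule by a · + · E would give every word of the language exactly one parse tree.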


▷ I.55. Extended context-free specifications. If A, B are context-free specifications then:
(i) the sequence class C = SEQ(A) is context-free; (ii) the substitution class $\mathcal{D} = \mathcal{A}[b \mapsto \mathcal{B}]$,
formally defined in the next section, is also context-free. ◁


I.6. Additional constructions 


This section is devoted to the constructions of sequences, sets, and cycles in the 
presence of restrictions on the number of components as well as to mechanisms that 
enrich the framework of core constructions; namely, pointing, substitution, and the 
use of implicit combinatorial definitions. 


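I.6.1. Restricted constructions. An immediate formula for OGFs is that of the
diagonal Δ of a cartesian product B × B defined as

\[ \Delta \equiv \Delta(\mathcal{B} \times \mathcal{B}) := \{ (\beta, \beta) \mid \beta \in \mathcal{B} \}. \]

Then, one has the relation $\Delta(z) = B(z^2)$, as shown by the combinatorial derivation

\[ \Delta(z) = \sum_{(\beta, \beta)} z^{2|\beta|} = B(z^2), \]

or by the equally obvious observation that $\Delta_{2n} = B_n$.

The diagonal construction permits us to access the class of all unordered pairs of
(distinct) elements of B, which is $\mathcal{A} = \text{PSET}_2(\mathcal{B})$. A direct argument then runs as
follows: the unordered pair {α, β} is associated to the two ordered pairs (α, β) and
(β, α) except when α = β, where an element of the diagonal is obtained. In other
words, one has the combinatorial isomorphism,

\[ \text{PSET}_2(\mathcal{B}) + \text{PSET}_2(\mathcal{B}) + \Delta(\mathcal{B} \times \mathcal{B}) \cong \mathcal{B} \times \mathcal{B}, \]

meaning that

\[ 2A(z) + B(z^2) = B(z)^2. \]

This gives the translation of $\text{PSET}_2$, and, by a similar argument for $\text{MSET}_2$ and $\text{CYC}_2$
(observe also that $\text{CYC}_2 \cong \text{MSET}_2$), one has:

\[ \begin{array}{lll}
\mathcal{A} = \text{PSET}_2(\mathcal{B}) & \Longrightarrow & A(z) = \frac{1}{2} B(z)^2 - \frac{1}{2} B(z^2) \\[1ex]
\mathcal{A} = \text{MSET}_2(\mathcal{B}) & \Longrightarrow & A(z) = \frac{1}{2} B(z)^2 + \frac{1}{2} B(z^2) \\[1ex]
\mathcal{A} = \text{CYC}_2(\mathcal{B}) & \Longrightarrow & A(z) = \frac{1}{2} B(z)^2 + \frac{1}{2} B(z^2).
\end{array} \]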
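These translations can be checked on a concrete class by listing pairs explicitly. The class used below, non-empty binary words with size equal to length, is an assumption chosen for illustration (so that $B_n = 2^n$ for n ≥ 1); the helper names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement, product

N = 6
# B: non-empty binary words of length at most N, size = length
objects = ["".join(w) for n in range(1, N + 1) for w in product("01", repeat=n)]
B = [0] + [2**n for n in range(1, N + 1)]


def tally(pairs):
    """Counting sequence (up to size N) of a collection of pairs of objects."""
    counts = [0] * (N + 1)
    for x, y in pairs:
        if len(x) + len(y) <= N:
            counts[len(x) + len(y)] += 1
    return counts


pset2 = tally(combinations(objects, 2))                     # unordered, distinct
mset2 = tally(combinations_with_replacement(objects, 2))    # unordered, repeats allowed


def coeff_square(n):
    """[z^n] B(z)^2"""
    return sum(B[i] * B[n - i] for i in range(n + 1))


def coeff_diagonal(n):
    """[z^n] B(z^2)"""
    return B[n // 2] if n % 2 == 0 else 0
```

For every n up to `N`, `2 * pset2[n]` equals `coeff_square(n) - coeff_diagonal(n)` and `2 * mset2[n]` equals `coeff_square(n) + coeff_diagonal(n)`.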
This type of direct reasoning could in principle be extended to treat triples, and so
on, but the computations easily grow out of control. The classical treatment of these
questions relies on what is known as Pólya theory, of which we offer a glimpse in
Notes I.58–I.60. We follow instead here an easier global approach, based on multivariate
generating functions, that suffices to generate simultaneously all cardinality-restricted
constructions of our standard collection.


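Theorem I.3 (Component-restricted constructions). The OGF of sequences with k
components, $\mathcal{A} = \text{SEQ}_k(\mathcal{B})$, satisfies

\[ A(z) = B(z)^k. \]

The OGF of sets, $\mathcal{A} = \text{PSET}_k(\mathcal{B})$, is a polynomial in the quantities $B(z), \ldots, B(z^k)$,

\[ A(z) = [u^k] \exp\left( \frac{u}{1} B(z) - \frac{u^2}{2} B(z^2) + \frac{u^3}{3} B(z^3) - \cdots \right). \]

The OGF of multisets, $\mathcal{A} = \text{MSET}_k(\mathcal{B})$, is

\[ A(z) = [u^k] \exp\left( \frac{u}{1} B(z) + \frac{u^2}{2} B(z^2) + \frac{u^3}{3} B(z^3) + \cdots \right). \]

The OGF of cycles, $\mathcal{A} = \text{CYC}_k(\mathcal{B})$, is, with φ the Euler totient function (p. 721),

\[ A(z) = [u^k] \sum_{\ell \ge 1} \frac{\varphi(\ell)}{\ell} \log \frac{1}{1 - u^{\ell} B(z^{\ell})}. \]

The explicit forms for small values of k are summarized in Figure I.18, p. 93.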
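The set and multiset formulae lend themselves to a direct computation. The sketch below extracts $[u^k]$ through the standard recurrence for the coefficients of an exponential ($i\,a_i = \sum_{j=1}^{i} (\pm 1)^{j-1} B(z^j)\, a_{i-j}$, obtained by differentiating with respect to u), and compares the result with a brute-force listing. The small test class and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations, combinations_with_replacement

N = 8                                    # truncation order in z


def mul(a, b):
    """Product of two truncated series in z."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b[: N + 1 - i]):
            c[i + j] += ai * bj
    return c


def dilate(b, j):
    """Coefficients of B(z^j)."""
    c = [Fraction(0)] * (N + 1)
    for n, bn in enumerate(b):
        if n * j <= N:
            c[n * j] = Fraction(bn)
    return c


def restricted(b, k, sign):
    """[u^k] exp(sum_j sign^(j-1) u^j B(z^j) / j): sign=+1 for MSET_k, -1 for PSET_k."""
    a = [[Fraction(1)] + [Fraction(0)] * N]            # a[0] = 1
    for i in range(1, k + 1):
        acc = [Fraction(0)] * (N + 1)
        for j in range(1, i + 1):
            term = mul(dilate(b, j), a[i - j])
            acc = [x + sign ** (j - 1) * t for x, t in zip(acc, term)]
        a.append([x / i for x in acc])
    return a[k]


def brute_force(b, k, chooser):
    """Count k-element selections made by `chooser` from explicit objects, by total size."""
    objects = [(n, i) for n, bn in enumerate(b) for i in range(bn)]
    counts = [0] * (N + 1)
    for selection in chooser(objects, k):
        total = sum(n for n, _ in selection)
        if total <= N:
            counts[total] += 1
    return counts


# a test class with 2 objects of size 1, 1 of size 2 and 3 of size 3
b = [0, 2, 1, 3] + [0] * (N - 3)
```

For k = 2 the recurrence reproduces the formulae $\frac{1}{2}B(z)^2 \pm \frac{1}{2}B(z^2)$ obtained above by the diagonal argument.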


Proof. The result for sequences is obvious since $\text{SEQ}_k(\mathcal{B})$ means B × ··· × B (k
times). For the other constructions, the proof makes use of the techniques of Theorem I.1,
p. 27, but it is best based on bivariate generating functions that are otherwise
developed fully in Chapter III to which we refer for details (p. 171). The idea consists
in describing all composite objects and introducing a supplementary marking variable
to keep track of the number of components.

Take $\mathfrak{K}$ to be a construction among SEQ, CYC, MSET, PSET. Consider the relation
$\mathcal{A} = \mathfrak{K}(\mathcal{B})$, and let χ(α) for α ∈ A be the parameter “number of B-components”.
Define the multivariate quantities

\[ A_{n,k} := \text{card}\{ \alpha \in \mathcal{A} \mid |\alpha| = n,\ \chi(\alpha) = k \}, \qquad A(z, u) := \sum_{n,k} A_{n,k}\, u^k z^n = \sum_{\alpha \in \mathcal{A}} z^{|\alpha|} u^{\chi(\alpha)}. \]

For instance, a direct calculation shows that, for sequences,

\[ A(z, u) = \sum_{k \ge 0} u^k B(z)^k = \frac{1}{1 - uB(z)}. \]

For multisets and powersets, a simple adaptation of the already seen argument gives
A(z, u) as

\[ A(z, u) = \prod_{n} (1 - u z^n)^{-B_n}, \qquad A(z, u) = \prod_{n} (1 + u z^n)^{B_n}, \]

respectively. The result follows from here by the exp–log transformation upon extracting
$[u^k]A(z, u)$. The case of cycles results from the bivariate generating function
derived in Appendix A.4: Cycle construction, p. 729 (alternatively use Note I.60). ■


▷ I.56. Aperiodic words. An aperiodic word is a primitive sequence of letters (in the sense
of Appendix A.4: Cycle construction, p. 729); that is, the word w is aperiodic provided it is
not obtained by repetition of a proper factor: w ≠ u···u. The number of aperiodic words of
length n over an m-ary alphabet is (with μ(k) the Möbius function, p. 721)

\[ PW_n^{(m)} = \sum_{d \mid n} \mu(d)\, m^{n/d}. \]

For m = 2, the sequence starts as 2, 2, 6, 12, 30, 54, 126, 240, 504, 990 (EIS A027375). ◁
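The Möbius sum is short enough to evaluate directly and to compare with an exhaustive test of primitivity. The sketch below is illustrative; the function names are hypothetical.

```python
from itertools import product


def mobius(n):
    """Moebius function, computed by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0                 # a squared prime factor
            result = -result
        p += 1
    return -result if n > 1 else result


def aperiodic_count(n, m):
    """Sum over the divisors d of n of mu(d) * m^(n/d)."""
    return sum(mobius(d) * m ** (n // d) for d in range(1, n + 1) if n % d == 0)


def is_aperiodic(word):
    """True if the word is not a repetition of a proper factor."""
    n = len(word)
    return all(word != word[:d] * (n // d) for d in range(1, n) if n % d == 0)


def aperiodic_by_enumeration(n, m):
    """Brute-force count over all m^n words (small inputs only)."""
    return sum(is_aperiodic(w) for w in product(range(m), repeat=n))
```

For m = 2 the values of `aperiodic_count(n, 2)` for n = 1, ..., 10 reproduce the sequence quoted in the note.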


▷ I.57. Around the cycle construction. A calculation with arithmetical functions (Appendix A,
p. 721) yields the OGFs of multisets of cycles and multisets of aperiodic cycles as

\[ \prod_{k \ge 1} \frac{1}{1 - A(z^k)} \qquad \text{and} \qquad \frac{1}{1 - A(z)}, \]

respectively [144]. (The latter fact corresponds to the combinatorial property that any word can
be written as a decreasing product of Lyndon words; notably, it serves to construct bases of free
Lie algebras [413, Ch. 5].) ◁


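▷ I.58. Pólya theory I: the cycle indicator. Consider a finite set M of cardinality m and a
group G of permutations of M. Whenever convenient, the set M can be identified with the
interval [1..m]. The cycle indicator (“Zyklenzeiger”) of G is, by definition, the multivariate
polynomial

\[ Z(G) \equiv Z(G; x_1, \ldots, x_m) = \frac{1}{\text{card}(G)} \sum_{g \in G} x_1^{j_1(g)} \cdots x_m^{j_m(g)}, \]

where $j_k(g)$ is the number of cycles of length k in the permutation g. For instance, if $\mathfrak{I}_m =
\{\text{Id}\}$ is the group reduced to the identity permutation, $\mathfrak{S}_m$ is the group of all permutations of
size m, and $\mathfrak{R}_m$ is the group consisting of the identity permutation and the “mirror-reflection”
permutation $\binom{1 \,\cdots\, m}{m \,\cdots\, 1}$, then

(89)    $\begin{array}{l}
Z(\mathfrak{I}_m) = x_1^m; \qquad Z(\mathfrak{S}_m) = \displaystyle\sum_{\substack{j_1, \ldots, j_m \ge 0 \\ j_1 + 2j_2 + \cdots + m j_m = m}} \frac{x_1^{j_1} \cdots x_m^{j_m}}{j_1!\, 1^{j_1} \cdots j_m!\, m^{j_m}}; \\[4ex]
Z(\mathfrak{R}_m) = \begin{cases} \frac{1}{2} x_2^{\nu} + \frac{1}{2} x_1^{2\nu} & \text{if } m = 2\nu \text{ is even} \\[1ex] \frac{1}{2} x_1 x_2^{\nu} + \frac{1}{2} x_1^{2\nu+1} & \text{if } m = 2\nu + 1 \text{ is odd.} \end{cases}
\end{array}$

(For the case of $\mathfrak{S}_m$, see Equation (40), Chapter III, p. 188.) ◁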


▷ I.59. Pólya theory II: the fundamental theorem. Let B be a combinatorial class and M a
finite set on which the group G acts. Consider the set $\mathcal{B}^M$ of all mappings from M into B.
Two mappings $\phi_1, \phi_2 \in \mathcal{B}^M$ are declared to be equivalent if there exists a g ∈ G such that
$\phi_1 \circ g = \phi_2$, and we let $(\mathcal{B}^M / G)$ be the set of equivalence classes. The problem is to enumerate
$(\mathcal{B}^M / G)$, given the data B, M, and the “symmetry group” G.

Let w be a weight function that assigns to any β ∈ B a weight w(β); the weight is extended
multiplicatively to any $\phi \in \mathcal{B}^M$, hence to $(\mathcal{B}^M / G)$, by $w(\phi) := \prod_{k \in M} w(\phi(k))$. The Pólya–Redfield
Theorem expresses the identity

(90)    $\displaystyle\sum_{\phi \in (\mathcal{B}^M / G)} w(\phi) = Z\Bigl( G;\ \sum_{\beta \in \mathcal{B}} w(\beta), \ldots, \sum_{\beta \in \mathcal{B}} w(\beta)^m \Bigr).$

In particular, we can choose $w(\beta) = z^{|\beta|}$ with z a formal parameter; the Pólya–Redfield
Theorem (90) then provides the OGF of objects of $\mathcal{B}^M$ up to symmetries by G:

(91)    $\displaystyle\sum_{\phi \in (\mathcal{B}^M / G)} z^{|\phi|} = Z\bigl( G;\ B(z), \ldots, B(z^m) \bigr).$

(There are many excellent presentations of this classic theory, starting with Pólya himself [488,
491]; see for instance Comtet [129, §6.6], De Bruijn [142], and Harary–Palmer [319, Ch. 2].
The proof relies on orbit counting and Burnside’s lemma.) ◁


▷ I.60. Pólya theory III: basic constructions. Say we want to obtain the OGF of A = MSET_3(B).
We view A as the set of triples $\mathcal{B}^M$, with M = [1..3], taken up to $\mathfrak{S}_3$, the
set of all permutations of three elements. The cycle indicator is given by (89), from which
the translation of MSET_3 results (see Figure I.18, p. 93, for the outcome); the calculation extends
to all MSET_m, providing an alternative approach to Theorem I.3. The translation of the
CYC construction can be obtained in this way via the cycle index of the group $\mathfrak{C}_m$ of all cyclic
permutations; namely,

\[ Z(\mathfrak{C}_m) = \frac{1}{m} \sum_{d \mid m} \varphi(d)\, x_d^{m/d}, \]

where φ(k) is the Euler totient function. The use of the groups $\mathfrak{R}_m$ gives rise to the undirected
sequence construction,

\[ \mathcal{A} = \operatorname{USEQ}(\mathcal{B}) \quad\Longrightarrow\quad A(z) = \frac12\, \frac{1}{1 - B(z)} + \frac12\, \frac{1 + B(z)}{1 - B(z^2)}, \]

where a sequence and its mirror image are identified. Similar principles give rise to the undirected
cycle construction UCYC, generated by cyclic permutations and mirror reflection. (The
approach taken in the text can be seen, in the perspective of Pólya theory, as a direct determination
of $\sum_{m \ge 0} Z(\mathfrak{G}_m)$, for an entire family of symmetry groups $\{\mathfrak{G}_m\}$, where $\mathfrak{G}_m = \mathfrak{C}_m, \mathfrak{S}_m, \ldots$) ◁
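As a small numerical illustration of the cycle index of the cyclic group, one can take a class with c objects that all receive weight 1, so that every variable of the cycle indicator is set to c. The result then counts colourings of an m-cycle up to rotation. The sketch below is an illustrative assumption, not the book's own code; the names `totient`, `necklaces_polya`, and `necklaces_brute` are hypothetical. It compares this evaluation with a direct orbit count.

```python
from itertools import product
from math import gcd

def totient(k):
    """Euler totient function, by direct counting (fine for small k)."""
    return sum(1 for i in range(1, k + 1) if gcd(i, k) == 1)

def necklaces_polya(m, c):
    """Evaluate (1/m) * sum_{d | m} phi(d) * x_d^(m/d) with every x_d = c."""
    total = sum(totient(d) * c ** (m // d) for d in range(1, m + 1) if m % d == 0)
    return total // m

def necklaces_brute(m, c):
    """Count orbits of length-m words under cyclic shifts via canonical forms."""
    seen = set()
    for w in product(range(c), repeat=m):
        seen.add(min(w[i:] + w[:i] for i in range(m)))
    return len(seen)

print(necklaces_polya(6, 2), necklaces_brute(6, 2))
```

Both functions should agree for every small m and c; the brute-force version lists all c^m words and is meant only as a check.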


▷ I.61. Sets with distinct component sizes. Let A be the class of the finite sets of elements from
B, with the additional constraint that no two elements in a set have the same size. One has

\[ A(z) = \prod_{n=1}^{\infty} \left(1 + B_n z^n\right). \]

Similar identities serve in the analysis of polynomial factorization algorithms [236]. ◁
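The product formula for sets with distinct component sizes is easy to expand on truncated series. The following Python sketch is illustrative only (the function name `distinct_size_sets` and the list convention `b[n] = B_n` are assumptions). It multiplies the factors (1 + B_n z^n) one at a time.

```python
def distinct_size_sets(b, n_max):
    """Coefficients of prod_{n>=1} (1 + b[n] z^n) up to z^n_max; b[0] is ignored."""
    a = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        # Multiply in place by (1 + b[n] z^n); descending k uses each factor once.
        for k in range(n_max, n - 1, -1):
            a[k] += b[n] * a[k - n]
    return a

# With B_n = 1 for all n >= 1, the sets are integer partitions into distinct parts.
print(distinct_size_sets([0] + [1] * 10, 10))
```

When B_n = 1 for every n ≥ 1, the output is the sequence of partitions into distinct summands, which gives a quick sanity check of the expansion.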


▷ I.62. Sequences without repeated components. The generating function is formally

\[ \int_0^{\infty} \exp\Bigl(\sum_{j \ge 1} (-1)^{j-1} \frac{u^j}{j} B(z^j)\Bigr) e^{-u}\, du. \]

(This representation is based on the Eulerian integral: $k! = \int_0^{\infty} e^{-u} u^k\, du$.) ◁


I.6.2. Pointing and substitution. Two more constructions, namely pointing and
substitution, translate agreeably into generating functions. Combinatorial structures
are viewed as always as formed of atoms (letters, nodes, etc.), which determine their
sizes. Pointing means “pointing at a distinguished atom”; substitution, written B ∘ C
or B[C], means “substitute elements of C for atoms of B”.


Definition I.14. Let $\{\epsilon_1, \epsilon_2, \ldots\}$ be a fixed collection of distinct neutral objects of
size 0. The pointing of a class B, denoted A = ΘB, is formally defined as

\[ \Theta\mathcal{B} := \sum_{n \ge 0} \mathcal{B}_n \times \{\epsilon_1, \ldots, \epsilon_n\}. \]

The substitution of C into B (also known as composition of B and C), noted B ∘ C
or B[C], is formally defined as

\[ \mathcal{B} \circ \mathcal{C} \equiv \mathcal{B}[\mathcal{C}] := \sum_{k \ge 0} \mathcal{B}_k \times \operatorname{SEQ}_k(\mathcal{C}). \]
k>0 


With B_n the number of B-structures of size n, the quantity nB_n can be interpreted
as counting pointed structures where one of the n atoms composing a B-structure has
been distinguished (here by a special “pointer” of size 0 attached to it). Elements of
B ∘ C may also be viewed as obtained by selecting in all possible ways an element
β ∈ B and replacing each of its atoms by an arbitrary element of C, while preserving
the underlying structure of β.

The interpretations above rely (silently) on the fact that atoms in an object can
be eventually distinguished from each other. This can be obtained by “canonicalizing”¹⁷
the representations of objects: first define inductively the lexicographic ordering
for products and sequences; next represent powersets and multisets as increasing
sequences with the induced lexicographic ordering (more complicated rules can also
canonicalize cycles). In this way, any constructible object admits a unique “rigid”
representation in which each particular atom is determined by its place. Such a canonicalization
thus reconciles the abstract definitions of Definition I.14 with the intuitive
interpretation of pointing and substitution.


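Theorem I.4 (Pointing and substitution). The constructions of pointing and substitution
are admissible¹⁸:

\[ \mathcal{A} = \Theta\mathcal{B} \quad\Longrightarrow\quad A(z) = z\,\partial_z B(z), \qquad \partial_z := \frac{d}{dz}, \]
\[ \mathcal{A} = \mathcal{B} \circ \mathcal{C} \quad\Longrightarrow\quad A(z) = B(C(z)). \]

Proof. By the definition of pointing, one has

\[ A_n = n \cdot B_n, \quad\text{so that}\quad A(z) = z\,\partial_z B(z). \]

The definition of substitution implies, by the sum and product rules,

\[ A(z) = \sum_{k \ge 0} B_k \cdot (C(z))^k = B(C(z)), \]

and the proof is completed. ∎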


¹⁷ Such canonicalization techniques also serve to develop fast algorithms for the exhaustive listing
of objects of a given size as well as for the range of problems known as “ranking” and “unranking”, with 
implications in fast random generation. See, for instance, [430, 456, 607] for the general theory as well 
as [500, 623] for particular cases such as necklaces and trees. 

¹⁸ In this book, we borrow from differential algebra the convenient notation $\partial_z := \frac{d}{dz}$ to represent
derivatives.


Permutations as pointed objects. As an example of pointing, consider the class P
of all permutations written as words over integers starting from 1. One can go from a
permutation of size n − 1 to a permutation of size n by selecting a “gap” and inserting
the value n. When this is done in all possible ways, it gives rise to the combinatorial
relation

\[ \mathcal{P} = \mathcal{E} + \Theta(\mathcal{Z} \times \mathcal{P}), \quad \mathcal{E} = \{\epsilon\}, \qquad\Longrightarrow\qquad P(z) = 1 + z \frac{d}{dz}\bigl(z P(z)\bigr). \]

The OGF satisfies an ordinary differential equation whose formal solution is $P(z) = \sum_{n \ge 0} n!\, z^n$,
since it is equivalent to the recurrence $P_n = n P_{n-1}$.
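The pointing relation for permutations can be iterated on truncated coefficient lists. The sketch below is a hypothetical illustration (the helpers `point`, `shift`, and `permutations_ogf` are not from the text). It applies P ← 1 + Θ(Z × P) repeatedly, using the rule A_n = n·B_n for pointing and a shift of indices for the product with the atom Z.

```python
def point(series):
    """Pointing: A_n = n * B_n, i.e. A(z) = z * d/dz B(z)."""
    return [n * b for n, b in enumerate(series)]

def shift(series):
    """Multiply by z (product with the atomic class), keeping the same length."""
    return [0] + series[:-1]

def permutations_ogf(n_max):
    """Iterate P <- 1 + Theta(Z x P); each pass fixes one more coefficient."""
    p = [0] * (n_max + 1)
    for _ in range(n_max + 1):
        q = point(shift(p))
        q[0] += 1
        p = q
    return p

print(permutations_ogf(6))
```

Each pass determines one further coefficient, so n_max + 1 passes suffice; the output should be the factorials 0!, 1!, …, 6!.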


Unary-binary trees as substituted objects. As an example of substitution, consider
the class B of (plane-rooted) binary trees, where all nodes contribute to size. If
at each node a linear chain of nodes (linked by edges placed on top of the node) is
substituted, one forms an element of the class M of unary-binary trees; in symbols:

\[ \mathcal{M} = \mathcal{B} \circ \operatorname{SEQ}_{\ge 1}(\mathcal{Z}) \quad\Longrightarrow\quad M(z) = B\!\left(\frac{z}{1 - z}\right). \]

Thus from the known OGF, $B(z) = (1 - \sqrt{1 - 4z^2})/(2z)$, one derives

\[ M(z) = \frac{1 - \sqrt{1 - 4z^2 (1 - z)^{-2}}}{2z (1 - z)^{-1}} = \frac{1 - z - \sqrt{1 - 2z - 3z^2}}{2z}, \]

which matches the direct derivation on p. 68 (Motzkin numbers).
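The substitution M = B ∘ SEQ≥1(Z) can be verified coefficient by coefficient. The following sketch is an illustration under stated assumptions: the helpers `mul`, `compose`, and `solve` are hypothetical, and the two specifications B = Z + Z × B × B and M = Z × (1 + M + M × M) are the standard ones for these tree classes, taken here as the "direct" route. All series are truncated at order N.

```python
def mul(a, b):
    """Product of two truncated power series of equal length."""
    n = len(a)
    c = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j in range(n - i):
                c[i + j] += ai * b[j]
    return c

def compose(b, c):
    """B(C(z)) truncated, assuming c[0] == 0 (Horner scheme)."""
    result = [0] * len(b)
    for coeff in reversed(b):
        result = mul(result, c)
        result[0] += coeff
    return result

def solve(step, n_max):
    """Fixed-point iteration Y <- step(Y); each pass fixes one more coefficient."""
    y = [0] * (n_max + 1)
    for _ in range(n_max + 1):
        y = step(y)
    return y

N = 10
z = [0, 1] + [0] * (N - 1)
one = [1] + [0] * N
# Binary trees with all nodes counted: B = Z + Z x B x B.
B = solve(lambda y: [u + v for u, v in zip(z, mul(z, mul(y, y)))], N)
# Nonempty chains SEQ_{>=1}(Z) have OGF z/(1-z) = z + z^2 + z^3 + ...
chains = [0] + [1] * N
M = compose(B, chains)
# Direct specification of unary-binary trees: M = Z x (1 + M + M x M).
M_direct = solve(lambda y: mul(z, [a + b + c for a, b, c in zip(one, y, mul(y, y))]), N)
print(M, M == M_direct)
```

The two lists should coincide, and their nonzero entries are the Motzkin numbers.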


▷ I.63. Combinatorics of derivatives. The combinatorial operation D of “erasing-pointing”
points to an atom in an object and replaces it by a neutral object, otherwise preserving the
overall structure of the object. The translation of D on OGFs is then simply $\partial \equiv \partial_z$. Classical
identities of analysis then receive transparent combinatorial interpretations: for instance,

\[ \partial(\mathcal{A} \times \mathcal{B}) = (\mathcal{A} \times \partial\mathcal{B}) + (\partial\mathcal{A} \times \mathcal{B}), \]

as well as Leibniz’s identity, $\partial^m (f \cdot g) = \sum_j \binom{m}{j} (\partial^j f) \cdot (\partial^{m-j} g)$, also follow from basic
logic. Similarly, for the “chain rule” $\partial(f \circ g) = ((\partial f) \circ g) \cdot \partial g$. (Example VII.25, p. 529,
illustrates the use of these methods for analytically solving many urn processes.) ◁




▷ I.64. The combinatorics of Newton–Raphson iteration. Given a real function f, the iteration
scheme of Newton–Raphson finds (conditionally) a root of the equation f(y) = 0 by
repeated use of the transformation $\alpha^\star = \alpha - f(\alpha)/f'(\alpha)$, starting for instance from α = 0.
(For sufficiently smooth functions, this scheme is quadratically convergent.) The application of
Newton–Raphson iteration to the equation y = zφ(y) associated with a simple variety of trees
in the sense of Proposition I.5, p. 66, leads to the scheme:

\[ \alpha_{m+1} = \alpha_m + \frac{z\phi(\alpha_m) - \alpha_m}{1 - z\phi'(\alpha_m)}, \qquad \alpha_0 = 0. \]

It can be seen, analytically and combinatorially, that $\alpha_m$ has a contact of order at least $2^m - 1$
with y(z). The interesting combinatorics is due to Décoste, Labelle, and Leroux [147]; it involves
a notion of “heavy” trees (such that at least one of the root subtrees is large enough, in a
suitable sense); see [50, §3.3] and [485] for further developments. ◁


I.6.3. Implicit structures. There are many cases where a combinatorial class X
is determined by a relation A = B + X, where A and B are known. (An instance of this
is the equational technique of Subsection I. 4.2, p. 56 for enumerating words that do
contain a given pattern p.) Less trivial examples involve inverting cartesian products
as well as sequences and multisets (examples below).

Theorem I.5 (Implicit specifications). The generating functions associated to the implicit
equations with unknown X

\[ \mathcal{A} = \mathcal{B} + \mathcal{X}, \qquad \mathcal{A} = \mathcal{B} \times \mathcal{X}, \qquad \mathcal{A} = \operatorname{SEQ}(\mathcal{X}), \]

are, respectively,

\[ X(z) = A(z) - B(z), \qquad X(z) = \frac{A(z)}{B(z)}, \qquad X(z) = 1 - \frac{1}{A(z)}. \]

For the implicit construction A = MSET(X), one has

\[ X(z) = \sum_{k \ge 1} \frac{\mu(k)}{k} \log A(z^k), \]

where μ(k) is the Möbius function¹⁹.


Proof. The first two cases result from kindergarten algebra, since in terms of OGFs
one has A = B + X and A = BX, respectively. For sequences, the relation $A(z) = (1 - X(z))^{-1}$
is readily inverted as stated. For multisets, start from the fundamental
relation of Theorem I.1 (p. 27) and take logarithms:

\[ \log(A(z)) = \sum_{k=1}^{\infty} \frac{1}{k} X(z^k). \]

Let L = log A and $L_n = [z^n] L(z)$. One has

\[ n L_n = \sum_{d \mid n} (d X_d), \]

to which it suffices to apply Möbius inversion (p. 721). ∎
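The multiset case of the proof translates directly into an integer algorithm. The sketch below is an illustration, not the authors' code (`mobius` and `mset_inverse` are hypothetical names). It first computes ℓ_n = n·L_n from the identity z A′(z) = A(z) · z (log A)′(z), which avoids fractions, and then applies Möbius inversion to ℓ_n = Σ_{d|n} d·X_d.

```python
def mobius(n):
    """Möbius function: (-1)^r for a product of r distinct primes, else 0."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def mset_inverse(a):
    """Given a[n] = [z^n] A(z) with A = MSET(X) and a[0] == 1, return the X_n."""
    n_max = len(a) - 1
    # ell[n] = n * [z^n] log A(z):  n*a[n] = sum_{k=1..n} ell[k] * a[n-k].
    ell = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        ell[n] = n * a[n] - sum(ell[k] * a[n - k] for k in range(1, n))
    # Möbius inversion of ell[n] = sum_{d | n} d * X_d.
    x = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        x[n] = sum(mobius(n // d) * ell[d] for d in range(1, n + 1) if n % d == 0) // n
    return x

# Integer partitions are multisets of positive integers: one component per size.
print(mset_inverse([1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42]))
```

Feeding in the partition numbers should return X_n = 1 for all n ≥ 1, since partitions are multisets of the integers 1, 2, 3, ….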


Example I.19. Indecomposable permutations. A permutation $\sigma = \sigma_1 \cdots \sigma_n$ (written here as a
word of distinct letters) is said to be decomposable if, for some k < n, $\sigma_1 \cdots \sigma_k$ is a permutation
of {1, …, k}; i.e., a strict prefix of the permutation (in word form) is itself a permutation.
Any permutation decomposes uniquely as a concatenation of indecomposable permutations, as
shown in Figure I.17.

As a consequence of our definitions, the class P of all permutations and the class I of
indecomposable ones are related by

\[ \mathcal{P} = \operatorname{SEQ}(\mathcal{I}). \]

This determines I(z) implicitly, and Theorem I.5 gives

\[ I(z) = 1 - \frac{1}{P(z)} \qquad\text{where}\qquad P(z) = \sum_{n \ge 0} n!\, z^n. \]


¹⁹ The Möbius function μ(n) is $\mu(n) = (-1)^r$ if n is the product of r distinct primes and μ(n) = 0
otherwise (Appendix A.1: Arithmetical functions, p. 721).

Figure I.17. The decomposition of a permutation (σ).

This example illustrates the utility of implicit constructions, and at the same time the possibility
of bona fide algebraic calculations with power series even in cases where they are divergent
(Appendix A.5: Formal power series, p. 730). One finds

\[ I(z) = z + z^2 + 3z^3 + 13z^4 + 71z^5 + 461z^6 + 3447z^7 + \cdots, \]

where the coefficients (EIS A003319) are

\[ I_n = n! - \sum_{\substack{n_1 + n_2 = n \\ n_1, n_2 \ge 1}} (n_1!\, n_2!) + \sum_{\substack{n_1 + n_2 + n_3 = n \\ n_1, n_2, n_3 \ge 1}} (n_1!\, n_2!\, n_3!) - \cdots. \]

From this, simple majorizations of the terms imply that $I_n \sim n!$, so that almost all permutations
are indecomposable [129, p. 262]. □
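The coefficients of I(z) = 1 − 1/P(z) can be obtained by inverting the (divergent, but formally well-defined) series P(z). The following sketch is illustrative; the names `indecomposable_counts` and `is_indecomposable` are assumptions. It computes them and cross-checks against a direct enumeration based on the definition.

```python
from itertools import permutations
from math import factorial

def indecomposable_counts(n_max):
    """Coefficients of I(z) = 1 - 1/P(z), with P(z) = sum_n n! z^n."""
    p = [factorial(n) for n in range(n_max + 1)]
    # q = 1/P from P * q = 1:  q[n] = -sum_{k=1..n} p[k] * q[n-k]  (p[0] = 1).
    q = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        q[n] = -sum(p[k] * q[n - k] for k in range(1, n + 1))
    return [0] + [-c for c in q[1:]]

def is_indecomposable(perm):
    """No strict prefix of length k < n is a permutation of {1, ..., k}."""
    running_max = 0
    for k, value in enumerate(perm[:-1], start=1):
        running_max = max(running_max, value)
        if running_max == k:    # the first k values are exactly {1, ..., k}
            return False
    return True

print(indecomposable_counts(7))
print([sum(is_indecomposable(s) for s in permutations(range(1, n + 1))) for n in range(1, 7)])
```

The prefix test uses the fact that the first k values form a permutation of {1, …, k} exactly when their maximum equals k.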


▷ I.65. Two-dimensional wanderings. A drunkard starts from the origin in the Z × Z plane and,
at each second, he makes a step in either one of the four directions, NW, NE, SW, SE. The steps
are thus ↖, ↗, ↙, ↘. Consider the class L of “primitive loops” defined as walks that start and
end at the origin, but do not otherwise touch the origin. The GF of L is (EIS A002894)

\[ L(z) = 1 - \frac{1}{\sum_{n \ge 0} \binom{2n}{n}^2 z^{2n}} = 4z^2 + 20z^4 + 176z^6 + 1876z^8 + \cdots. \]

(Hint: a walk is determined by its projections on the horizontal and vertical axes; one-dimensional
walks that return to the origin in 2n steps are enumerated by $\binom{2n}{n}$.) In particular $[z^{2n}] L(z/4)$ is
the probability that the random walk first returns to the origin in 2n steps.

Such problems largely originate with Pólya and implicit constructions were well-mastered
by him [490]; see also [85] for certain multidimensional extensions. The first-return problem
is analysed asymptotically in Chapter VI, p. 425, based on singularity theory and Hadamard
closure properties. ◁
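A short computation confirms the expansion of L(z) and the combinatorial meaning of its coefficients. This is an illustrative sketch; `primitive_loops` and `first_return_walks` are hypothetical names. The first function inverts the series of all loops, the second counts first-return walks exhaustively.

```python
from itertools import product
from math import comb

def primitive_loops(n_max):
    """[z^(2n)] L(z) for n = 0..n_max, where L = 1 - 1/W, W = sum C(2n,n)^2 z^(2n)."""
    w = [comb(2 * n, n) ** 2 for n in range(n_max + 1)]
    inv = [1] + [0] * n_max                      # coefficients of 1/W
    for n in range(1, n_max + 1):
        inv[n] = -sum(w[k] * inv[n - k] for k in range(1, n + 1))
    return [0] + [-c for c in inv[1:]]

STEPS = [(1, 1), (1, -1), (-1, 1), (-1, -1)]     # NE, SE, NW, SW

def first_return_walks(length):
    """Count walks of the given length whose first return to the origin is the last step."""
    count = 0
    for walk in product(STEPS, repeat=length):
        x = y = 0
        for i, (dx, dy) in enumerate(walk, start=1):
            x, y = x + dx, y + dy
            if x == 0 and y == 0:
                count += (i == length)
                break
    return count

print(primitive_loops(4))
print([first_return_walks(2 * n) for n in range(1, 4)])
```

Dividing the coefficient of z^{2n} by 4^{2n} gives the first-return probability mentioned in the note; for 2n = 2 this is 4/16 = 1/4.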


Example I.20. Irreducible polynomials over finite fields. Objects not obviously of a combinatorial
nature can sometimes be enumerated by symbolic methods. Here is an indirect construction
relative to polynomials over finite fields. We fix a prime number p and consider the base
field $\mathbb{F}_p$ of integers taken modulo p. The polynomial ring $\mathbb{F}_p[X]$ is the ring of polynomials
in X with coefficients in $\mathbb{F}_p$.

For all practical purposes, one may restrict attention to polynomials that are monic; that
is, ones whose leading coefficient is 1. We regard the set P of monic polynomials in $\mathbb{F}_p[X]$
as a combinatorial class, with the size of a polynomial being identified to its degree. Since a
polynomial is specified by the sequence of its coefficients, one has, with A the “alphabet” of
coefficients, $\mathcal{A} = \mathbb{F}_p$ treated as a collection of atomic objects,

\[ \mathcal{P} \cong \operatorname{SEQ}(\mathcal{A}) \quad\Longrightarrow\quad P(z) = \frac{1}{1 - pz}, \tag{92} \]




in agreement with the fact that there are $p^n$ monic polynomials of degree n.

Polynomials are a unique factorization domain, since they can be subjected to Euclidean
division. A polynomial that has no proper non-constant divisor is termed irreducible; irreducibles
are thus the analogues of the primes in the integer realm. For instance, over $\mathbb{F}_3$, one has

\[ X^{10} + X^8 + 1 = (X + 1)^2 (X + 2)^2 (X^6 + 2X^2 + 1). \]

Let I be the set of monic irreducible polynomials. The unique factorization property implies
that the collection of all polynomials is combinatorially isomorphic to the multiset class (there
may be repeated factors) of the collection of irreducibles:

\[ \mathcal{P} \cong \operatorname{MSET}(\mathcal{I}) \quad\Longrightarrow\quad P(z) = \exp\Bigl(I(z) + \tfrac12 I(z^2) + \tfrac13 I(z^3) + \cdots\Bigr). \tag{93} \]

The irreducibles are thus determined implicitly from the class of all polynomials whose
OGF is known by (92). Theorem I.5 then implies the identity

\[ I(z) = \sum_{k \ge 1} \frac{\mu(k)}{k} \log\frac{1}{1 - p z^k} \qquad\text{and}\qquad I_n = \frac{1}{n} \sum_{k \mid n} \mu(k)\, p^{n/k}. \tag{94} \]

In particular, $I_n$ is asymptotic to $p^n/n$. This estimate constitutes the density theorem for irreducible
polynomials, a result already known to Gauss (see the scholarly notes of von zur Gathen
and Gerhard in [599, p. 396]):

The fraction of irreducible polynomials among all polynomials of degree n over the finite field
$\mathbb{F}_p$ is asymptotic to 1/n.

This property is analogous to the Prime Number Theorem (which however lies much deeper,
see [22, 138]), according to which the proportion of prime numbers in the interval [1, n] is
asymptotic to 1/log n. Indeed, a polynomial of degree n appears to be roughly comparable to
a number written in base p having n digits. (On the basis of such properties, Knopfmacher
has further developed in [370] an abstract theory of statistical properties of arithmetical semigroups.)
We pursue this thread further in the book: we shall prove that the number of factors
in a random polynomial of degree n is on average ∼ log n (Example VII.4, p. 449) and that the
corresponding distribution is asymptotically Gaussian (Example IX.21, p. 672). □
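The closed form for I_n in (94) can be compared with a direct count over small fields. The sketch below is an illustration with hypothetical helper names (`mobius`, `irreducible_count`, `monic`, `poly_mul`, `irreducible_brute`). Polynomials are coefficient tuples with the constant term first, and a monic polynomial is declared reducible when it is a product of two monic factors of lower positive degree.

```python
from itertools import product

def mobius(n):
    """Möbius function: (-1)^r for a product of r distinct primes, else 0."""
    result, q = 1, 2
    while q * q <= n:
        if n % q == 0:
            n //= q
            if n % q == 0:
                return 0
            result = -result
        q += 1
    return -result if n > 1 else result

def irreducible_count(n, p):
    """I_n = (1/n) * sum_{k | n} mu(k) * p^(n/k)."""
    return sum(mobius(k) * p ** (n // k) for k in range(1, n + 1) if n % k == 0) // n

def monic(deg, p):
    """All monic polynomials of the given degree over F_p (low degree first)."""
    return [c + (1,) for c in product(range(p), repeat=deg)]

def poly_mul(f, g, p):
    """Product of two polynomials with coefficients reduced modulo p."""
    h = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] = (h[i + j] + fi * gj) % p
    return tuple(h)

def irreducible_brute(n, p):
    """p^n minus the number of distinct products of two lower-degree monic factors."""
    reducible = {poly_mul(f, g, p)
                 for a in range(1, n // 2 + 1)
                 for f in monic(a, p) for g in monic(n - a, p)}
    return p ** n - len(reducible)

print([irreducible_count(n, 2) for n in range(1, 9)])
print([irreducible_brute(n, 2) for n in range(1, 9)])
```

The brute-force sieve is only meant for small p and n; the two printed lists should agree.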


▷ I.66. Square-free polynomials. Let Q be the class of monic square-free polynomials (i.e.,
polynomials not divisible by the square of a polynomial). One has by “Vallée’s identity” (p. 30)
$Q(z) = P(z)/P(z^2)$, hence

\[ Q(z) = \frac{1 - p z^2}{1 - p z} \qquad\text{and}\qquad Q_n = p^n - p^{n-1} \quad (n \ge 2). \]

Berlekamp’s book [51] discusses such facts together with relations to error correcting codes. ◁
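The count Q_n = p^n − p^{n−1} can be checked by listing the monic polynomials that are divisible by the square of a non-constant monic polynomial. This is a minimal sketch with hypothetical helper names (`monic`, `poly_mul`, `squarefree_brute`), using coefficient tuples with the constant term first.

```python
from itertools import product

def monic(deg, p):
    """All monic polynomials of the given degree over F_p (low degree first)."""
    return [c + (1,) for c in product(range(p), repeat=deg)]

def poly_mul(f, g, p):
    """Product of two polynomials with coefficients reduced modulo p."""
    h = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] = (h[i + j] + fi * gj) % p
    return tuple(h)

def squarefree_brute(n, p):
    """Monic polynomials of degree n not of the form f^2 * g with deg f >= 1."""
    divisible = {poly_mul(poly_mul(f, f, p), g, p)
                 for a in range(1, n // 2 + 1)
                 for f in monic(a, p) for g in monic(n - 2 * a, p)}
    return p ** n - len(divisible)

print([squarefree_brute(n, 3) for n in range(5)])
```

For n = 0 and n = 1 every monic polynomial is square-free, which matches the coefficients 1 and p of the rational function above.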


▷ I.67. Balanced trees. The class E of balanced 2–3 trees contains all the (rooted planar) trees
whose internal nodes have degree 2 or 3 and such that all leaves are at the same distance from
the root. Only leaves contribute to size. Such trees, which are particular cases of B-trees, are a
useful data structure for implementing dynamic dictionaries [378, 537]. Balanced trees satisfy
an implicit equation based on combinatorial substitution:

\[ \mathcal{E} = \mathcal{Z} + \mathcal{E} \circ [(\mathcal{Z} \times \mathcal{Z}) + (\mathcal{Z} \times \mathcal{Z} \times \mathcal{Z})] \quad\Longrightarrow\quad E(z) = z + E(z^2 + z^3). \]

The expansion starts as (EIS A014535)

\[ E(z) = z + z^2 + z^3 + z^4 + 2z^5 + 2z^6 + 3z^7 + 4z^8 + 5z^9 + 8z^{10} + \cdots. \]

Odlyzko [459] has determined the growth of $E_n$ to be roughly as $\varphi^n/n$, where $\varphi = (1 + \sqrt{5})/2$
is the golden ratio. See Subsection IV. 7.2, p. 280 for an analysis. ◁
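The functional equation E(z) = z + E(z² + z³) determines the coefficients recursively, since (z² + z³)^k = z^{2k}(1 + z)^k contributes C(k, n − 2k) to the coefficient of z^n. The sketch below is illustrative; the function name `balanced_23_trees` is an assumption.

```python
from math import comb

def balanced_23_trees(n_max):
    """E_n for n = 0..n_max from E(z) = z + E(z^2 + z^3); size = number of leaves."""
    e = [0] * (n_max + 1)
    if n_max >= 1:
        e[1] = 1
    for n in range(2, n_max + 1):
        # [z^n] (z^2 + z^3)^k = C(k, n - 2k), which vanishes unless 2k <= n <= 3k.
        e[n] = sum(e[k] * comb(k, n - 2 * k) for k in range(1, n // 2 + 1))
    return e

print(balanced_23_trees(10))
```

Only indices k ≤ n/2 occur on the right-hand side, so each E_n depends on strictly smaller sizes and the recurrence is well founded.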
I.7. Perspective


This chapter and the next amount to a survey of elementary combinatorial enu- 
merations, organized in a coherent manner and summarized in Figure I.18, in the case 
of the unlabelled universe that is considered here. We refer to the process of specify- 
ing combinatorial classes using these constructions and then automatically having ac- 
cess to the corresponding generating functions as the symbolic method. The symbolic 
method is the “combinatorics” in analytic combinatorics: it allows us to structure clas- 
sical results in combinatorics with a unifying overall approach, to derive new results 
that generalize and extend classical problems, and to address new classes of problems 
that are arising in computer science, computational biology, statistical physics, and 
other scientific disciplines. 


More importantly, the symbolic method leaves us with generating functions that 
we can handle with the “analytic” part of analytic combinatorics. A full treatment of 
this feature of the approach is premature, but a brief discussion may help place the rest 
of the book in context. 


For a given family of problems, the symbolic method typically leads to a natural 
class of functions in which the corresponding generating functions lie. Even though 
the symbolic method is completely formal, we can often successfully proceed by using 
classical techniques from complex and asymptotic analysis. For example, denumer- 
ants with a finite set of coin denominations always lead to rational generating functions 
with poles on the unit circle. Such an observation is useful as a common strategy for 
coefficient extraction can then be applied (partial fraction expansion, in the case of 
denumerants with fixed coin denominations). In the same vein, run statistics constitute
a particular case of the general theorem of Chomsky and Schützenberger to the
effect that the generating function of a regular language is necessarily a rational function.
Similarly, context-free structures are attached to generating functions that are
invariably algebraic. Theorems of this sort establish a bridge between combinatorial 
analysis and special functions. 


Not all applications of the symbolic method are automatic (although that is cer- 
tainly one goal underlying the approach). The example of counting set partitions 
shows that application of the symbolic method may require finding an adequate pre- 
sentation of the combinatorial structures to be counted. In this way, bijective combi- 
natorics enters the game in a non-trivial fashion. 


Our introductory examples of compositions and partitions correspond to classes 
of combinatorial structures with explicit “iterative” definitions, a fact leading in turn 
to explicit generating function expressions. The tree examples then introduce recur- 
sively defined structures. In that case, the recursive definition translates into a func- 
tional equation that only determines the generating function implicitly. In simpler 
situations (such as binary or general trees), the generating function equations can be 
solved and explicit counting results often follow. In other cases (such as non-plane 
trees) one can usually conduct an analysis of singularities directly from the functional 
equations and obtain very precise asymptotic estimates: Chapters IV-VII of Part B 
offer an abundance of illustrations of this paradigm. The further development on a
suitable perturbative theory will then lead us to systematic ways of quantifying parameters
(not just counting sequences) of large combinatorial structures; this is the
subject of Chapter IX, in Part C of this book.


1. The main constructions of disjoint union (combinatorial sum), product, sequence, powerset,
multiset, and cycle and their translation into generating functions (Theorem I.1).

              Construction      OGF
  Union       A = B + C         $A(z) = B(z) + C(z)$
  Product     A = B × C         $A(z) = B(z) \cdot C(z)$
  Sequence    A = SEQ(B)        $A(z) = \frac{1}{1 - B(z)}$
  Powerset    A = PSET(B)       $A(z) = \exp\bigl(B(z) - \frac12 B(z^2) + \frac13 B(z^3) - \cdots\bigr)$
  Multiset    A = MSET(B)       $A(z) = \exp\bigl(B(z) + \frac12 B(z^2) + \frac13 B(z^3) + \cdots\bigr)$
  Cycle       A = CYC(B)        $A(z) = \log\frac{1}{1 - B(z)} + \frac12 \log\frac{1}{1 - B(z^2)} + \cdots$

2. The translation for sequences, powersets, multisets, and cycles constrained by the number of
components (Theorem I.3, p. 84).

  SEQ_k(B):   $B(z)^k$

  PSET_2(B):  $\frac{B(z)^2}{2} - \frac{B(z^2)}{2}$
  MSET_2(B):  $\frac{B(z)^2}{2} + \frac{B(z^2)}{2}$
  CYC_2(B):   $\frac{B(z)^2}{2} + \frac{B(z^2)}{2}$

  PSET_3(B):  $\frac{B(z)^3}{6} - \frac{B(z) B(z^2)}{2} + \frac{B(z^3)}{3}$
  MSET_3(B):  $\frac{B(z)^3}{6} + \frac{B(z) B(z^2)}{2} + \frac{B(z^3)}{3}$
  CYC_3(B):   $\frac{B(z)^3}{3} + \frac{2 B(z^3)}{3}$

  PSET_4(B):  $\frac{B(z)^4}{24} - \frac{B(z)^2 B(z^2)}{4} + \frac{B(z) B(z^3)}{3} + \frac{B(z^2)^2}{8} - \frac{B(z^4)}{4}$
  MSET_4(B):  $\frac{B(z)^4}{24} + \frac{B(z)^2 B(z^2)}{4} + \frac{B(z) B(z^3)}{3} + \frac{B(z^2)^2}{8} + \frac{B(z^4)}{4}$
  CYC_4(B):   $\frac{B(z)^4}{4} + \frac{B(z^2)^2}{4} + \frac{B(z^4)}{2}$

3. The additional constructions of pointing and substitution (Section I. 6).

                 Construction     OGF
  Pointing       A = ΘB           $A(z) = z \frac{d}{dz} B(z)$
  Substitution   A = B ∘ C        $A(z) = B(C(z))$

Figure I.18. A dictionary of constructions applicable to unlabelled structures, together
with their translation into ordinary generating functions (OGFs). (The labelled
counterpart of this table appears in Figure II.18, p. 148.)


Bibliographic notes. Modern presentations of combinatorial analysis appear in the books of 
Comtet [129] (a beautiful book largely example-driven), Stanley [552, 554] (a rich set with an 
algebraic orientation), Wilf [608] (generating functions oriented), and Lando [400] (a neat mod- 
ern introduction). An elementary but insightful presentation of the basic techniques appears in 
Graham, Knuth, and Patashnik’s classic [307], a popular book with a highly original design. An 
encyclopaedic reference is the book of Goulden & Jackson [303] whose descriptive approach 
very much parallels ours. 

The sources of the modern approaches to combinatorial analysis are hard to trace since 
they are usually based on earlier traditions and informally stated mechanisms that were well- 
mastered by practicing combinatorial analysts. (See for instance MacMahon’s book [428] Com- 
binatory Analysis first published in 1917, the introduction of denumerant generating functions 
by Pólya as presented in [489, 493], or the “domino theory” in [307, Sec. 7.1].) One source in recent
times is the Chomsky–Schützenberger theory of formal languages and enumerations [119].
Rota [518] and Stanley [550, 554] developed an approach which is largely based on partially 
ordered sets. Bender and Goldman developed a theory of “prefabs” [42] whose purposes are 
similar to the theory developed here. Joyal [359] proposed an especially elegant framework, the 
“theory of species”, that addresses foundational issues in combinatorial theory and constitutes 
the starting point of the superb exposition by Bergeron, Labelle, and Leroux [50]. Parallel (but 
largely independent) developments by the “Russian School” are nicely synthesized in the books 
by Sachkov [525, 526]. 

One of the reasons for the revival of interest in combinatorial enumerations and proper- 
ties of random structures is the analysis of algorithms (a subject founded in modern times by 
Knuth [381]), in which the goal is to model the performance of computer algorithms and pro- 
grams. The symbolic ideas expounded here have been applied to the analysis of algorithms 
in surveys [221, 598], with elements presented in our book [538]. Further implications of 
the symbolic method in the area of the random generation of combinatorial structures appear 
in [177, 228, 264, 456]. 


[...] une propriété qui se traduit par une égalité |A| = |B| est mieux explicitée lorsque l’on
construit une bijection entre deux ensembles A et B, plutôt qu’en calculant les coefficients
d’un polynôme dont les variables n’ont pas de significations particulières. La méthode des
fonctions génératrices, qui a exercé ses ravages pendant un siècle, est tombée en désuétude
pour cette raison.


(“[...] a property, which is translated by an equality |A| = |B|, is understood better, when one constructs 
a bijection between the two sets A and B, than when one calculates the coefficients of a polynomial whose 
variables have no particular meaning. The method of generating functions, which has had devastating 
effects for a century, has fallen into obsolescence, for this reason.” ) 


—CLAUDE BERGE [48, p. 10] 


Labelled Structures and Exponential 
Generating Functions 


Cette approche évacue pratiquement tous les calculs¹.


— DOMINIQUE FOATA &
MARCO SCHÜTZENBERGER [267]

II. 5. Labelled trees, mappings, and graphs 125
II. 6. Additional constructions 136
II. 7. Perspective 147


Many objects of classical combinatorics present themselves naturally as labelled structures,
where atoms of an object (typically nodes in a graph or a tree) are distinguishable
from one another by the fact that they bear distinct labels. Without loss of generality,
we may take the set from which labels are drawn to be the set of integers. For
instance, a permutation can be viewed as a linear arrangement of distinct integers, and 
the classical cycle decomposition represents it as an unordered collection of circular 
digraphs, whose vertices are themselves integers. 

Operations on labelled structures are based on a special product: the labelled 
product that distributes labels between components. This operation is a natural ana- 
logue of the cartesian product for plain unlabelled objects. The labelled product in 
turn leads to labelled analogues of the sequence, set, and cycle constructions. 

Labelled constructions translate over exponential generating functions—the trans- 
lation schemes turn out to be even simpler than in the unlabelled case. At the same 
time, these constructions enable us to take into account structures that are in some 
ways combinatorially richer than their unlabelled counterparts of Chapter I, in par- 
ticular with regard to order properties. Labelled constructions constitute the second 
pillar of the symbolic method for combinatorial enumeration. 

In this chapter, we examine some of the most important classes of labelled objects,
including surjections, set partitions, permutations, as well as labelled graphs, trees,
and mappings from a finite set into itself. Certain aspects of words can also be treated
by this theory, a fact which has important consequences not only in combinatorics
itself but also in probability and statistics. In particular, labelled constructions of
words provide an elegant solution to two classical problems, the birthday problem and
the coupon collector problem, as well as several of their variants that have numerous
applications in other fields, including the analysis of hashing algorithms in computer
science.

¹ “This approach eliminates virtually all calculations.” Foata and Schützenberger refer here to a “geometric”
approach to combinatorics, much akin to ours, that permits one to relate combinatorial properties
and special function identities.


II. 1. Labelled classes 


Throughout this chapter, we consider combinatorial classes in the sense of Def- 
inition I.1, p. 16: we deal exclusively with finite objects; a combinatorial class A is 
a set of objects, with a notion of size attached, so that the number of objects of each 
size in A is finite. To these basic concepts, we now add that the objects are labelled, 
by which we mean that each atom carries with it a distinctive colour, or equivalently 
an integer label, in such a way that all the labels occurring in an object are distinct. 
Precisely: 


Definition II.1. A weakly labelled object of size n is a graph whose set of vertices 
is a subset of the integers. Equivalently, we say that the vertices bear labels, with 
the implied condition that labels are distinct integers from Z. An object of size n is 
said to be well-labelled, or simply, labelled, if it is weakly labelled and, in addition, 
its collection of labels is the complete integer interval [1..n]. A labelled class is a 
combinatorial class comprised of well-labelled objects. 


The graphs considered may be directed or undirected. In fact, when the need 
arises, we shall take “object” in a broad sense to mean any kind of discrete structure 
enriched by integer labels. Virtually all labelled classes considered in this book can 
eventually be encoded as graphs of sorts, so that this extended use of the notion of 
a labelled class is a harmless convenience. (See Section II.7, p. 147 for a brief dis- 
cussion of alternative but logically equivalent frameworks for the notion of a labelled 
class.) 


Example II.1. Labelled graphs. By definition, a labelled graph is an undirected graph such that
distinct integer labels forming an interval of the form {1, 2, …, n} are supported by vertices. A
particular labelled graph of size 4 is for instance

          1 — 3
    g  =  |   |
          4 — 2

which represents a graph whose vertices bear the labels {1, 2, 3, 4} and whose set of edges is

    {{1, 3}, {2, 3}, {2, 4}, {1, 4}}.

Only the graph structure (as defined by its adjacency structure, i.e., its set of edges) counts, so
that this is the same abstract graph as in the alternative physical representations

[Drawings: two other layouts of the same graph g, with the same set of edges.]

However, this graph is different from either of

          4 — 1               3 — 1
    h  =  |   |   ,     j  =  |   |   ,
          3 — 2               4 — 2

since, for instance, 1 and 2 are adjacent in h and j, but not in g. Altogether, there are 3
different labelled graphs (namely, g, h, j), that have the same “shape”, corresponding to the
single unlabelled quadrangle graph

          • — •
          |   |
          • — •

Figure II.1 lists all the 64 labelled graphs of size 4 as well as their 11 unlabelled counterparts
viewed as equivalence classes of labelled graphs when labels are ignored. □


There are altogether $G_4 = 64 = 2^6$ labelled graphs of size 4, i.e., comprising 4 nodes, in
agreement with the general formula (see p. 105 for details): $G_n = 2^{n(n-1)/2}$. The labelled
graphs can be grouped into equivalence classes up to arbitrary permutation of the labels, which
determines the $\widehat{G}_4 = 11$ unlabelled graphs of size 4. Each unlabelled graph corresponds to a
variable number of labelled graphs: for instance, the totally disconnected graph (bottom, left)
and the complete graph (top right) correspond to 1 labelling only, while the line graph (top left)
admits $\frac12 4! = 12$ possible labellings.

Figure II.1. Labelled versus unlabelled graphs for size n = 4.
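The counts quoted for graphs of size 4 can be reproduced by exhaustive enumeration. The following sketch is an illustration with hypothetical names (`relabel`, `orbit`, `labelled`, `shapes`). It lists all labelled graphs on {1, 2, 3, 4} as sets of edges and groups them into classes under permutation of the labels.

```python
from itertools import combinations, permutations

VERTICES = (1, 2, 3, 4)
ALL_EDGES = [frozenset(e) for e in combinations(VERTICES, 2)]   # the 6 possible edges

def relabel(graph, sigma):
    """Apply the label permutation sigma (a dict) to every edge of the graph."""
    return frozenset(frozenset(sigma[v] for v in edge) for edge in graph)

def orbit(graph):
    """All labelled graphs that have the same shape as `graph`."""
    return frozenset(relabel(graph, dict(zip(VERTICES, p))) for p in permutations(VERTICES))

# Every subset of the 6 possible edges is a labelled graph of size 4.
labelled = [frozenset(s) for r in range(len(ALL_EDGES) + 1) for s in combinations(ALL_EDGES, r)]
shapes = {orbit(graph) for graph in labelled}

g = frozenset(map(frozenset, [(1, 3), (2, 3), (2, 4), (1, 4)]))  # the quadrangle g
path = frozenset(map(frozenset, [(1, 2), (2, 3), (3, 4)]))       # a line graph

print(len(labelled), len(shapes), len(orbit(g)), len(orbit(path)))
```

The four printed numbers should be the number of labelled graphs, the number of unlabelled graphs, the number of labellings of the quadrangle, and the number of labellings of the line graph.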


In order to count labelled objects, we appeal to exponential generating functions. 


Definition II.2. The exponential generating function (EGF) of a sequence $(A_n)$ is the
formal power series

\[ A(z) = \sum_{n \ge 0} A_n \frac{z^n}{n!}. \tag{1} \]

The exponential generating function (EGF) of a class A is the exponential generating
function of the numbers $A_n = \operatorname{card}(\mathcal{A}_n)$. Equivalently, the EGF of class A is

\[ A(z) = \sum_{n \ge 0} A_n \frac{z^n}{n!} = \sum_{\alpha \in \mathcal{A}} \frac{z^{|\alpha|}}{|\alpha|!}. \]

It is also said that the variable z marks size in the generating function.


With the standard notation for coefficients of series, the coefficient $A_n$ in an exponential
generating function is then recovered by²

\[ A_n = n! \cdot [z^n] A(z), \]

since $[z^n] A(z) = A_n/n!$ by the definition of EGFs and in accordance with the coefficient
extractor notation, Equation (9), p. 19, in Chapter I.

Note that, as in the previous chapter, we adhere to a systematic naming convention
for generating functions of combinatorial structures. A labelled class A, its counting
sequence $(A_n)$ (or $(a_n)$), and its exponential generating function A(z) (or a(z)) are all
denoted by the same group of letters. As usual, combinatorially isomorphic classes
(Definition I.3, p. 19) are freely identified.


Neutral and atomic classes. As in the unlabelled universe (p. 24), it proves useful
to introduce a neutral (empty, null) object ε that has size 0 and bears no label at all, and
consider it as a special labelled object; a neutral class E is then by definition E = {ε}
and is also denoted by boldface 1. The (labelled) atomic class Z = {①} is formed of a
unique object of size 1 that, being well-labelled, bears the integer label ①. The EGFs
of the neutral class and the atomic class are, respectively,

\[ E(z) = 1, \qquad Z(z) = z. \]


Permutations, urns, and circular graphs. These structures, described in Exam- 
ples II.2-II.4, are undoubtedly the most fundamental ones for labelled enumeration. 


Example II.2. Permutations. The class $\mathcal{P}$ of all permutations is prototypical of labelled classes.
Under the linear representation of permutations, where

$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma_1 & \sigma_2 & \cdots & \sigma_n \end{pmatrix}$$

is represented as the sequence $(\sigma_1, \sigma_2, \ldots, \sigma_n)$, the class $\mathcal{P}$ is schematically

P = { ε, ①, ①-②, ②-①, ①-②-③, ①-③-②, ②-①-③, ②-③-①, ③-①-②, ③-②-①, ... }

so that $P_0 = 1$, $P_1 = 1$, $P_2 = 2$, $P_3 = 6$, etc. There, by definition, all the possible orderings
of the distinct labels are taken into account, so that the class $\mathcal{P}$ can be equivalently viewed as
the class of all labelled linear digraphs (with an implicit direction, from left to right, say, in the
representation). Accordingly, the class $\mathcal{P}$ of permutations has the counting sequence $P_n = n!$
(argument: there are n choices of where to place the element 1, then (n - 1) possible positions
for 2, and so on). Thus the EGF of $\mathcal{P}$ is

$$P(z) = \sum_{n \ge 0} n! \frac{z^n}{n!} = \sum_{n \ge 0} z^n = \frac{1}{1 - z}.$$

Permutations, as they contain information relative to the ordering of their elements, are essential
in many applications related to order statistics. □

² Some authors prefer the notation $[\frac{z^n}{n!}]A(z)$ to $n![z^n]A(z)$, which we avoid in this book. Indeed,
Knuth [376] argues convincingly that the variant notation is not consistent with many desirable properties
of a “good” coefficient operator (e.g., bilinearity).


Example II.3. Urns. The class $\mathcal{U}$ of totally disconnected graphs starts as

U = { ε, ①, ① ②, ① ② ③, ① ② ③ ④, ... }

The ordering between the labelled atoms does not matter, so that for each n, there is only one
possible arrangement and $U_n = 1$. The class $\mathcal{U}$ can be regarded as the class of urns, where
an urn of size n contains n distinguishable balls in an unspecified (and irrelevant) order. The
corresponding EGF is

$$U(z) = \sum_{n \ge 0} 1 \cdot \frac{z^n}{n!} = \exp(z) = e^z.$$

(The fact that the EGF of the constant sequence $(1)_{n \ge 0}$ is the exponential function explains the
term “exponential generating function”.) It also proves convenient, in several applications, to
represent elements of an urn in a sorted sequence, which leads to an equivalent representation
of urns as increasing linear graphs; for instance,

①-②-③-④-⑤

may be equivalently used to represent the urn of size 5. Though urns look trivial at first glance,
they are of particular importance as building blocks of complex labelled structures (e.g., allocations
of various sorts), as we shall see shortly. □


Example II.4. Circular graphs. Finally, the class of circular graphs, in which cycles are
oriented in some conventional manner (say, positively here) is

[Schematic listing of the oriented cycles of sizes 1, 2, 3, ... not reproduced.]

Circular graphs correspond bijectively to cyclic permutations. One has $C_n = (n - 1)!$ (argument:
a directed cycle is determined by the succession of elements that “follow” 1, hence by a
permutation of n - 1 elements). Thus, one has

$$C(z) = \sum_{n \ge 1} (n - 1)! \frac{z^n}{n!} = \sum_{n \ge 1} \frac{z^n}{n} = \log \frac{1}{1 - z}.$$

As we shall see in the next section, the logarithm is characteristic of circular arrangements of
labelled objects. □
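As a small numerical companion to Examples II.2 to II.4, the sketch below rebuilds the three counting sequences from the Taylor coefficients of their EGFs through $A_n = n!\,[z^n]A(z)$. The helper name `counts_from_egf` and the truncation order `N` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from math import factorial

N = 8  # truncation order (arbitrary, for illustration only)

def counts_from_egf(coeffs):
    """Recover A_n = n! * [z^n] A(z) from exact Taylor coefficients."""
    return [int(factorial(n) * c) for n, c in enumerate(coeffs)]

# Taylor coefficients [z^n] of the three EGFs, written from their known series.
perm_egf = [Fraction(1)] * (N + 1)                                      # 1/(1-z)
urn_egf = [Fraction(1, factorial(n)) for n in range(N + 1)]             # exp(z)
cycle_egf = [Fraction(0)] + [Fraction(1, n) for n in range(1, N + 1)]   # log 1/(1-z)

permutations = counts_from_egf(perm_egf)   # expected: n!
urns = counts_from_egf(urn_egf)            # expected: 1 for every n
cycles = counts_from_egf(cycle_egf)        # expected: (n-1)! for n >= 1

print(permutations)
print(urns)
print(cycles)
```

Exact rational arithmetic is used so that multiplying by n! returns integers without rounding.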


▷ II.1. Labelled trees. Let $U_n$ now be the number of labelled graphs with n vertices that are
connected and acyclic; equivalently, $U_n$ is the number of labelled unrooted non-plane trees. Let
$T_n$ be the number of labelled rooted non-plane trees. The identity $T_n = nU_n$ is elementary,
since all vertices in a labelled tree are distinguished by their labels and a root can be chosen in n
ways. In Section II. 5, p. 125, we shall prove that $U_n = n^{n-2}$ and $T_n = n^{n-1}$. ◁
II. 2. Admissible labelled constructions


We now describe a toolkit of constructions that make it possible to build complex 
labelled classes from simpler ones. Combinatorial sum, also known as disjoint union,
is taken in the sense of Chapter I, p. 25: it is the union of disjoint copies. Next, in
order to define a product adapted to labelled structures, we cannot rely on the carte- 
sian product, since a pair of two labelled objects is not well-labelled (for instance the 
label 1 would invariably appear repeated twice). Instead, we define a new operation, 
the labelled product, which translates naturally into exponential generating functions. 
From here, simple translation rules follow for labelled sequences, sets, and cycles. 


Binomial convolutions. As a preparation to the translation of labelled constructions,
we first briefly review the effect of products over EGFs. Let $a(z)$, $b(z)$, $c(z)$ be
EGFs, with $a(z) = \sum_n a_n z^n/n!$, and so on. The binomial convolution formula is:

$$\text{if } a(z) = b(z) \cdot c(z), \text{ then } a_n = \sum_{k=0}^{n} \binom{n}{k} b_k c_{n-k}, \tag{2}$$

where $\binom{n}{k} = n!/(k!\,(n-k)!)$ represents, as usual, a binomial coefficient. This formula
results from the usual product of formal power series,

$$\frac{a_n}{n!} = \sum_{k=0}^{n} \frac{b_k}{k!} \cdot \frac{c_{n-k}}{(n-k)!} \quad\text{and}\quad \binom{n}{k} = \frac{n!}{k!\,(n-k)!}.$$

In the same vein, if $a(z) = b^{(1)}(z)\, b^{(2)}(z) \cdots b^{(r)}(z)$, then

$$a_n = \sum_{n_1 + n_2 + \cdots + n_r = n} \binom{n}{n_1, n_2, \ldots, n_r} b^{(1)}_{n_1} b^{(2)}_{n_2} \cdots b^{(r)}_{n_r}. \tag{3}$$

In Equation (3) there occurs the multinomial coefficient

$$\binom{n}{n_1, n_2, \ldots, n_r} = \frac{n!}{n_1!\, n_2! \cdots n_r!},$$

which counts the number of ways of splitting n elements into r distinguishable classes
of cardinalities $n_1, \ldots, n_r$. This property lies at the very heart of enumerative applications
of binomial convolutions and EGFs.
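The binomial convolution formula (2) can be checked mechanically. The sketch below computes $a_n$ in two ways: directly from (2), and by multiplying the EGFs as truncated power series and then rescaling by n!. The function names are hypothetical and chosen for this illustration only.

```python
from fractions import Fraction
from math import comb, factorial

def binomial_convolution(b, c):
    """a_n = sum_k C(n, k) * b_k * c_{n-k}, for two sequences of equal length."""
    return [sum(comb(n, k) * b[k] * c[n - k] for k in range(n + 1))
            for n in range(len(b))]

def egf_product_counts(b, c):
    """The same numbers, obtained by multiplying the EGFs as power series."""
    bz = [Fraction(x, factorial(n)) for n, x in enumerate(b)]
    cz = [Fraction(x, factorial(n)) for n, x in enumerate(c)]
    az = [sum(bz[k] * cz[n - k] for k in range(n + 1)) for n in range(len(b))]
    return [int(factorial(n) * x) for n, x in enumerate(az)]

# Example: b_n = c_n = 1 (EGF e^z for both), so a(z) = e^{2z} and a_n = 2^n.
ones = [1] * 8
a = binomial_convolution(ones, ones)
print(a)  # [1, 2, 4, 8, 16, 32, 64, 128]
```

Both routes agree term by term, which is exactly the content of (2).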


II. 2.1. Labelled constructions. A labelled object may be relabelled. We only 
consider consistent relabellings defined by the fact that they preserve the order rela- 
tions among labels. Then two dual modes of relabellings prove important: 


- Reduction: For a weakly labelled structure of size n, this operation reduces
  its labels to the standard interval [1..n] while preserving the relative order
  of labels. For instance, the sequence (7, 3, 9, 2) reduces to (3, 2, 4, 1). We
  use $\rho(\alpha)$ to denote this canonical reduction of the structure $\alpha$.

- Expansion: This operation is defined relative to a relabelling function
  $e : [1..n] \to \mathbb{Z}$ that is assumed to be strictly increasing. To a well-labelled
  object $\alpha$ of size n, it associates a weakly labelled object $\alpha'$, in which label j
  of $\alpha$ is replaced by the label e(j). For instance, (3, 2, 4, 1) may expand as
  (33, 22, 44, 11), (7, 3, 9, 2), and so on. We use $e(\alpha)$ to denote the result of
  relabelling $\alpha$ by e.

Figure II.2. The $10 = \binom{5}{2}$ elements in the labelled product of a triangle and a segment.
[Figure not reproduced.]
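The two relabelling modes are easy to make concrete. The sketch below assumes, purely for illustration, that a weakly labelled structure is encoded as a tuple of distinct integers and that a relabelling function e is given as a dictionary on 1..n; the names `reduce_labels` and `expand_labels` are not from the text.

```python
def reduce_labels(seq):
    """Canonical reduction rho: replace each label by its rank in 1..n."""
    rank = {label: i + 1 for i, label in enumerate(sorted(seq))}
    return tuple(rank[label] for label in seq)

def expand_labels(seq, e):
    """Expansion by a strictly increasing relabelling e (a dict on 1..n)."""
    values = [e[j] for j in range(1, len(seq) + 1)]
    if any(x >= y for x, y in zip(values, values[1:])):
        raise ValueError("relabelling must be strictly increasing")
    return tuple(e[label] for label in seq)

print(reduce_labels((7, 3, 9, 2)))                            # (3, 2, 4, 1)
print(expand_labels((3, 2, 4, 1), {1: 2, 2: 3, 3: 7, 4: 9}))  # (7, 3, 9, 2)
```

Reducing an expanded object always returns the original well-labelled object, which is why the two operations are described as dual.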


These notions enable us to devise a product well suited to labelled objects, which was 
originally formalized under the name of “partitional product” by Foata [265]. The 
idea is simply to relabel objects, so as to avoid duplicate labels. 

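Given two labelled objects $\beta \in \mathcal{B}$ and $\gamma \in \mathcal{C}$, their labelled product, or simply
product, denoted by $\beta \star \gamma$, is the set of all well-labelled ordered
pairs $(\beta', \gamma')$ that reduce to $(\beta, \gamma)$:

$$\beta \star \gamma := \{\, (\beta', \gamma') \mid (\beta', \gamma') \text{ is well-labelled},\ \rho(\beta') = \beta,\ \rho(\gamma') = \gamma \,\}. \tag{4}$$

An equivalent form, via expansion of labels, is

$$\beta \star \gamma = \{\, (e(\beta), f(\gamma)) \mid \operatorname{Im}(e) \cap \operatorname{Im}(f) = \emptyset,\ \operatorname{Im}(e) \cup \operatorname{Im}(f) = [1..|\beta| + |\gamma|] \,\}, \tag{5}$$

where e, f are relabelling functions with ranges Im(e), Im(f), respectively.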

Note that elements of a labelled product are, by construction, well-labelled. The
labelled product $(\beta \star \gamma)$ of two elements $\beta$, $\gamma$ of respective sizes $n_1$, $n_2$ is a set whose
cardinality is, with $n = n_1 + n_2$, expressed as

$$\binom{n_1 + n_2}{n_1, n_2} = \binom{n}{n_1},$$

since this quantity is the number of legal relabellings by expansion of the pair $(\beta, \gamma)$.
(Figure II.2 displays the $\binom{5}{2} = 10$ elements of the labelled product of a particular
object of size 3 with another object of size 2.) The labelled product of classes is then
defined by the natural extension of operations to sets.


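Definition II.3. The labelled product of $\mathcal{B}$ and $\mathcal{C}$, denoted $\mathcal{B} \star \mathcal{C}$, is obtained by forming
ordered pairs from $\mathcal{B} \times \mathcal{C}$ and performing all possible order-consistent relabellings.
In symbols:

$$\mathcal{B} \star \mathcal{C} = \bigcup_{\beta \in \mathcal{B},\ \gamma \in \mathcal{C}} (\beta \star \gamma). \tag{6}$$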
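To see the labelled product of two objects at work, the following sketch enumerates $\beta \star \gamma$ for an object of size 3 and an object of size 2, as in Figure II.2. It assumes each object is encoded as a tuple listing its labels (the tuple `(1, 2, 3)` merely stands in for the triangle), and the names `relabel` and `labelled_product` are illustrative.

```python
from itertools import combinations
from math import comb

def relabel(obj, labels):
    """Expand obj (a tuple over 1..k) onto the increasing sequence `labels`."""
    return tuple(labels[j - 1] for j in obj)

def labelled_product(beta, gamma):
    """All well-labelled pairs (beta', gamma') that reduce to (beta, gamma)."""
    n = len(beta) + len(gamma)
    pairs = []
    # Choosing which labels go to beta determines both expansions.
    for chosen in combinations(range(1, n + 1), len(beta)):
        rest = [x for x in range(1, n + 1) if x not in chosen]
        pairs.append((relabel(beta, chosen), relabel(gamma, rest)))
    return pairs

elements = labelled_product((1, 2, 3), (1, 2))
for pair in elements:
    print(pair)
print(len(elements))  # 10, i.e. binomial(5, 2)
```

The count equals the number of ways to choose the label set of the first component, in line with the cardinality formula given before Definition II.3.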
Equipped with this notion, we can build sequences, sets, and cycles, in a way
very similar to the unlabelled case. We proceed to do so and, at the same time,
establish admissibility³ of the constructions.


Labelled product. When $\mathcal{A} = \mathcal{B} \star \mathcal{C}$, the corresponding counting sequences satisfy
the relation

$$A_n = \sum_{|\beta| + |\gamma| = n} \binom{|\beta| + |\gamma|}{|\beta|, |\gamma|} = \sum_{n_1 + n_2 = n} \binom{n}{n_1, n_2} B_{n_1} C_{n_2}. \tag{7}$$

The product $B_{n_1} C_{n_2}$ keeps track of all the possibilities for the $\mathcal{B}$ and $\mathcal{C}$ components
and the binomial coefficient accounts for the number of possible relabellings, in accordance
with our earlier discussion. The binomial convolution property (7) then implies
admissibility,

$$\mathcal{A} = \mathcal{B} \star \mathcal{C} \implies A(z) = B(z) \cdot C(z),$$

with the labelled product simply translating into the product operation on EGFs.


▷ II.2. Multiple labelled products. The (binary) labelled product satisfies the associativity
property,

$$\mathcal{B} \star (\mathcal{C} \star \mathcal{D}) \cong (\mathcal{B} \star \mathcal{C}) \star \mathcal{D},$$

which serves to define $\mathcal{B} \star \mathcal{C} \star \mathcal{D}$. The corresponding EGF is the product $B(z) \cdot C(z) \cdot D(z)$.
This rule generalizes to r factors with coefficients given by a multinomial convolution (3). ◁


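k-sequences and sequences. The kth (labelled) power of $\mathcal{B}$ is defined as
$(\mathcal{B} \star \mathcal{B} \cdots \mathcal{B})$, with k factors equal to $\mathcal{B}$. It is denoted $\mathrm{SEQ}_k(\mathcal{B})$ as it corresponds to forming
k-sequences and performing all consistent relabellings. The (labelled) sequence class
of $\mathcal{B}$ is denoted by $\mathrm{SEQ}(\mathcal{B})$ and is defined by

$$\mathrm{SEQ}(\mathcal{B}) := \{\epsilon\} + \mathcal{B} + (\mathcal{B} \star \mathcal{B}) + (\mathcal{B} \star \mathcal{B} \star \mathcal{B}) + \cdots = \bigcup_{k \ge 0} \mathrm{SEQ}_k(\mathcal{B}).$$

The product relation for EGFs extends to arbitrary products (Note II.2 above), so that

$$\mathcal{A} = \mathrm{SEQ}_k(\mathcal{B}) \implies A(z) = B(z)^k,$$
$$\mathcal{A} = \mathrm{SEQ}(\mathcal{B}) \implies A(z) = \sum_{k \ge 0} B(z)^k = \frac{1}{1 - B(z)},$$

where the last equation requires $\mathcal{B}_0 = \emptyset$.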


k-sets and sets. We denote by $\mathrm{SET}_k(\mathcal{B})$ the class of k-sets formed from $\mathcal{B}$. The
set class is defined formally, as in the case of the unlabelled multiset: it is the quotient
$\mathrm{SET}_k(\mathcal{B}) := \mathrm{SEQ}_k(\mathcal{B})/\mathbf{R}$, where the equivalence relation $\mathbf{R}$ identifies two sequences
when the components of one are a permutation of the components of the other (p. 26).
A “set” is like a sequence, but the order between components is immaterial. The
(labelled) set construction applied to $\mathcal{B}$, denoted $\mathrm{SET}(\mathcal{B})$, is then defined by

$$\mathrm{SET}(\mathcal{B}) := \{\epsilon\} + \mathcal{B} + \mathrm{SET}_2(\mathcal{B}) + \cdots = \bigcup_{k \ge 0} \mathrm{SET}_k(\mathcal{B}).$$

³ We recall that a construction is admissible (Definition I.5, p. 22) if the counting sequence of the result
only depends on the counting sequences of the operands. An admissible construction therefore induces a
well-defined transformation over exponential generating functions.
A labelled k-set is associated with exactly k! different sequences, since all its components
are distinguishable by their labels. Precisely, one may choose to identify each
component in a labelled set or sequence by its “leader”; that is, the value of its smallest
label. There is then a uniform k!-to-one correspondence between k-sequences
and k-sets, as illustrated in a particular case (k = 3) by the diagram below:

[Diagram not reproduced.]

In figurative terms: the contents of a bag containing k different items can be laid on a
table in k! ways. Thus in terms of EGFs, one has, assuming $\mathcal{B}_0 = \emptyset$,

$$\mathcal{A} = \mathrm{SET}_k(\mathcal{B}) \implies A(z) = \frac{1}{k!} B(z)^k,$$
$$\mathcal{A} = \mathrm{SET}(\mathcal{B}) \implies A(z) = \sum_{k \ge 0} \frac{1}{k!} B(z)^k = \exp(B(z)).$$

In the unlabelled case, formulae are more complex, since components in multisets
are not necessarily different. Note also that the distinction between multisets and
powersets, which is meaningful for unlabelled structures, is here immaterial, and we
have the unlabelled-to-labelled analogy: MSET, PSET ⇝ SET.


k-cycles and cycles. We also introduce the class of k-cycles, $\mathrm{CYC}_k(\mathcal{B})$, and the
cycle class. The cycle class is defined formally, as in the unlabelled case, to be the
quotient $\mathrm{CYC}_k(\mathcal{B}) := \mathrm{SEQ}_k(\mathcal{B})/\mathbf{S}$, where the equivalence relation $\mathbf{S}$ identifies two
sequences when the components of one are a cyclic permutation of the components
of the other (p. 26). A cycle is like a sequence whose components can be cyclically
shifted, so that there is now a uniform k-to-one correspondence between k-sequences
and k-cycles. In terms of EGFs, we have (assuming $\mathcal{B}_0 = \emptyset$ and $k \ge 1$)

$$\mathcal{A} = \mathrm{CYC}_k(\mathcal{B}) \implies A(z) = \frac{1}{k} B(z)^k,$$
$$\mathcal{A} = \mathrm{CYC}(\mathcal{B}) \implies A(z) = \sum_{k \ge 1} \frac{1}{k} B(z)^k = \log \frac{1}{1 - B(z)},$$

since each cycle admits exactly k representations as a sequence. In summary:


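Theorem II.1 (Basic admissibility, labelled universe). The constructions of combinatorial
sum, labelled product, sequence, set, and cycle are all admissible. Associated
operators on EGFs are:

Sum:              $\mathcal{A} = \mathcal{B} + \mathcal{C} \implies A(z) = B(z) + C(z)$,
Product:          $\mathcal{A} = \mathcal{B} \star \mathcal{C} \implies A(z) = B(z) \cdot C(z)$,
Sequence:         $\mathcal{A} = \mathrm{SEQ}(\mathcal{B}) \implies A(z) = \frac{1}{1 - B(z)}$,
 - k components:  $\mathcal{A} = \mathrm{SEQ}_k(\mathcal{B}) \equiv (\mathcal{B})^{\star k} \implies A(z) = B(z)^k$,
Set:              $\mathcal{A} = \mathrm{SET}(\mathcal{B}) \implies A(z) = \exp(B(z))$,
 - k components:  $\mathcal{A} = \mathrm{SET}_k(\mathcal{B}) \implies A(z) = \frac{1}{k!} B(z)^k$,
Cycle:            $\mathcal{A} = \mathrm{CYC}(\mathcal{B}) \implies A(z) = \log \frac{1}{1 - B(z)}$,
 - k components:  $\mathcal{A} = \mathrm{CYC}_k(\mathcal{B}) \implies A(z) = \frac{1}{k} B(z)^k$.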
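The dictionary of Theorem II.1 translates directly into operations on truncated power series. The sketch below is one possible encoding, under the assumption that an EGF is stored as a list of exact rational coefficients modulo $z^{N+1}$; the function names `seq`, `set_`, `cyc` and `counts` are hypothetical helpers, not an established library API.

```python
from fractions import Fraction
from math import factorial

N = 8  # all series are truncated modulo z^(N+1)

def mul(a, b):
    """Product of two truncated power series."""
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(N + 1)]

def _power_sum(b, weight, start):
    """sum over k >= start of weight(k) * B(z)^k; needs b[0] == 0 so that
    powers beyond N vanish modulo z^(N+1)."""
    assert b[0] == 0
    out = [Fraction(0)] * (N + 1)
    power = [Fraction(1)] + [Fraction(0)] * N      # B^0
    for k in range(N + 1):
        if k >= start:
            out = [x + weight(k) * y for x, y in zip(out, power)]
        power = mul(power, b)
    return out

def seq(b):
    """SEQ: 1 / (1 - B(z))."""
    return _power_sum(b, lambda k: Fraction(1), 0)

def set_(b):
    """SET: exp(B(z))."""
    return _power_sum(b, lambda k: Fraction(1, factorial(k)), 0)

def cyc(b):
    """CYC: log 1 / (1 - B(z))."""
    return _power_sum(b, lambda k: Fraction(1, k), 1)

def counts(a):
    """Counting sequence n! [z^n] A(z)."""
    return [int(factorial(n) * x) for n, x in enumerate(a)]

Z = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 1)   # atomic class, EGF z

print(counts(seq(Z)))        # permutations: n!
print(counts(set_(Z)))       # urns: 1, 1, 1, ...
print(counts(cyc(Z)))        # circular graphs: (n-1)!
print(counts(set_(cyc(Z))))  # sets of cycles: n! again
```

The last line illustrates numerically that sets of cycles and sequences of atoms have the same counting sequence, since $\exp(\log\frac{1}{1-z}) = \frac{1}{1-z}$.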


Constructible classes. As in the previous chapter, we say that a class of labelled
objects is constructible if it admits a specification in terms of sums (disjoint unions),
the labelled constructions of product, sequence, set, cycle, and the initial classes defined
by the neutral structure of size 0 and the atomic class $\mathcal{Z} = \{①\}$. Regarding the
elementary classes discussed in Section II. 1, it is immediately recognized that

$$\mathcal{P} = \mathrm{SEQ}(\mathcal{Z}), \qquad \mathcal{U} = \mathrm{SET}(\mathcal{Z}), \qquad \mathcal{C} = \mathrm{CYC}(\mathcal{Z}),$$

specify permutations, urns, and circular graphs, respectively. These classes are basic
building blocks out of which more complex objects can be constructed. In particular,
as we shall explain shortly (Section II. 3 and Section II. 4), set partitions ($\mathcal{S}$), surjections
($\mathcal{R}$), permutations under their cycle decomposition ($\mathcal{P}$), and alignments ($\mathcal{O}$) are
constructible classes corresponding to

Surjections:     $\mathcal{R} = \mathrm{SEQ}(\mathrm{SET}_{\ge 1}(\mathcal{Z}))$   (sequences-of-sets);
Set partitions:  $\mathcal{S} = \mathrm{SET}(\mathrm{SET}_{\ge 1}(\mathcal{Z}))$   (sets-of-sets);
Alignments:      $\mathcal{O} = \mathrm{SEQ}(\mathrm{CYC}(\mathcal{Z}))$   (sequences-of-cycles);
Permutations:    $\mathcal{P} = \mathrm{SET}(\mathrm{CYC}(\mathcal{Z}))$   (sets-of-cycles).


An immediate consequence of Theorem II.1 is the fact that a functional equation 
for the EGF of a constructible labelled class can be computed automatically. 
Theorem II.2 (Symbolic method, labelled universe). The exponential generating function
of a constructible class of labelled objects is a component of a system of generating
function equations whose terms are built from 1 and z using the operators

$$+, \quad \times, \quad Q(f) = \frac{1}{1 - f}, \quad E(f) = e^f, \quad L(f) = \log \frac{1}{1 - f}.$$

When we further allow restrictions in composite constructions, the operators $f^k$ (for
$\mathrm{SEQ}_k$), $f^k/k!$ (for $\mathrm{SET}_k$), and $f^k/k$ (for $\mathrm{CYC}_k$) are to be added to the list.


II. 2.2. Labelled versus unlabelled enumeration. Any labelled class $\mathcal{A}$ has an
unlabelled counterpart $\widehat{\mathcal{A}}$: objects in $\widehat{\mathcal{A}}$ are obtained from objects of $\mathcal{A}$ by ignoring
the labels. This idea is formalized by identifying two labelled objects if there is an
arbitrary relabelling (not just an order-consistent one, as has been used so far) that
transforms one into the other. For an object of size n, each equivalence class contains
a priori between 1 and n! elements. Thus:


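Proposition II.1. The counts of a labelled class $\mathcal{A}$ and its unlabelled counterpart $\widehat{\mathcal{A}}$
are related by

$$\widehat{A}_n \le A_n \le n!\,\widehat{A}_n \quad\text{or equivalently}\quad 1 \le \frac{A_n}{\widehat{A}_n} \le n!. \tag{8}$$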


Example II.5. Labelled and unlabelled graphs. This phenomenon has been already encountered
in our discussion of graphs (Figure II.1, p. 97). Let in general $G_n$ and $\widehat{G}_n$ be the number
of graphs of size n in the labelled and unlabelled case, respectively. One finds for n = 1..12:

  n | $\widehat{G}_n$ (unlabelled) | $G_n$ (labelled)
  1 |            1 |                    1
  2 |            2 |                    2
  3 |            4 |                    8
  4 |           11 |                   64
  5 |           34 |                 1024
  6 |          156 |                32768
  7 |         1044 |              2097152
  8 |        12346 |            268435456
  9 |       274668 |          68719476736
 10 |     12005168 |       35184372088832
 11 |   1018997864 |    36028797018963968
 12 | 165091172592 | 73786976294838206464


The sequence $(\widehat{G}_n)$ constitutes EIS A000088, which can be obtained by an extension of methods
of Chapter I, p. 85, specifically by Pólya theory [319, Ch. 4]. The sequence $(G_n)$ is determined
directly by the fact that a graph of n vertices can have each of the $\binom{n}{2}$ possible edges either
present or not, so that

$$G_n = 2^{\binom{n}{2}} = 2^{n(n-1)/2}.$$


The sequence of labelled counts obviously grows much faster than its unlabelled counterpart.
We may then verify the inequality (8) in this particular case. The normalized ratios,

$$\rho_n := G_n/\widehat{G}_n, \qquad \sigma_n := G_n/(n!\,\widehat{G}_n),$$

are observed to be

  n | $\rho_n = G_n/\widehat{G}_n$ | $\sigma_n = G_n/(n!\,\widehat{G}_n)$
  1 | 1.000000000              | 1.0000000000
  2 | 1.000000000              | 0.5000000000
  3 | 2.000000000              | 0.3333333333
  4 | 5.818181818              | 0.2424242424
  6 | 210.0512821              | 0.2917378918
  8 | 21742.70663              | 0.5392536367
 12 | 446946830.2              | 0.9330800361
 16 | 0.2076885783 · 10^{14}   | 0.9926428522


From these data, it is natural to conjecture that $\sigma_n$ tends rapidly to 1 as n tends to infinity. This is
indeed a non-trivial fact originally established by Pólya (see Chapter 9 of Harary and Palmer’s
book [319] dedicated to asymptotics of graph enumerations):

$$\widehat{G}_n \sim \frac{1}{n!}\, 2^{\binom{n}{2}} = \frac{G_n}{n!}.$$

In other words, “almost all” graphs of size n should admit a number of labellings close to n!.
(Combinatorially, this corresponds to the fact that in a random unlabelled graph, with high
probability, all of the nodes can be distinguished via the adjacency structure of the graph; in
such a case, the graph has no non-trivial automorphism and the number of distinct labellings is
n! exactly.) □
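The ratios of Example II.5 can be recomputed from the two columns of the table above. In the sketch below the unlabelled counts are copied from that table for n = 1..12, while the labelled counts come from $G_n = 2^{\binom{n}{2}}$; the function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

# Unlabelled graph counts for n = 1..12, copied from the table of Example II.5.
UNLABELLED = [1, 2, 4, 11, 34, 156, 1044, 12346, 274668,
              12005168, 1018997864, 165091172592]

def labelled_graphs(n):
    """Each of the C(n, 2) possible edges is either present or absent."""
    return 2 ** comb(n, 2)

def ratios(n):
    """Return (rho_n, sigma_n) as exact fractions, for 1 <= n <= 12."""
    g, g_hat = labelled_graphs(n), UNLABELLED[n - 1]
    return Fraction(g, g_hat), Fraction(g, factorial(n) * g_hat)

for n in (1, 2, 3, 4, 6, 8, 12):
    rho, sigma = ratios(n)
    print(n, float(rho), float(sigma))
```

For every n in range, $\rho_n$ lies between 1 and n!, as inequality (8) requires, and $\sigma_n$ is seen to approach 1.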


In contrast with the case of all graphs, where $\widehat{G}_n \sim G_n/n!$, urns (totally disconnected
graphs) illustrate the other extreme situation where

$$\widehat{U}_n = U_n.$$

These examples indicate that, beyond the general bounds of Proposition II.1, there
is no automatic way to translate between labelled and unlabelled enumerations. But
at least, if the class $\mathcal{A}$ is constructible, its unlabelled counterpart $\widehat{\mathcal{A}}$ can be obtained
by interpreting all the intervening constructions as unlabelled ones in the sense of
Chapter I (with SET ↦ MSET); both generating functions are computable, and their
coefficients can then be compared.

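▷ II.3. Permutations and their unlabelled counterparts. The labelled class of permutations can
be specified by $\mathcal{P} = \mathrm{SEQ}(\mathcal{Z})$; the unlabelled counterpart is the set $\widehat{\mathcal{P}}$ of integers in unary
notation, and $\widehat{P}_n \equiv 1$, so that $P_n = n! \cdot \widehat{P}_n$ exactly. The specification $\mathcal{P}' = \mathrm{SET}(\mathrm{CYC}(\mathcal{Z}))$ describes
sets of cycles and, in the labelled universe, one has $\mathcal{P}' \cong \mathcal{P}$; however, the unlabelled counterpart
of $\mathcal{P}'$ is the class $\widehat{\mathcal{P}'} \ne \widehat{\mathcal{P}}$ of integer partitions examined in Chapter I. [In the unlabelled
universe, there are special combinatorial isomorphisms such as $\mathrm{SEQ}_{\ge 1}(\mathcal{Z}) \cong \mathrm{MSET}_{\ge 1}(\mathcal{Z}) \cong \mathrm{CYC}(\mathcal{Z})$.
In the labelled universe, the identity $\mathrm{SET} \circ \mathrm{CYC} \equiv \mathrm{SEQ}$ holds.] ◁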
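The contrast drawn in Note II.3 can be observed numerically. The sketch below counts integer partitions, the unlabelled counterpart of sets of cycles, with the standard "one part size at a time" dynamic programme (a textbook method, not one given in this section), and compares them with the labelled count n!, checking the bounds of Proposition II.1 along the way.

```python
from math import factorial

def integer_partitions(n_max):
    """p(0), ..., p(n_max): number of integer partitions of each size."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):          # allow parts of size `part`
        for total in range(part, n_max + 1):
            p[total] += p[total - part]
    return p

p = integer_partitions(8)
for n in range(9):
    labelled = factorial(n)      # sets of cycles on n labelled atoms
    unlabelled = p[n]            # integer partitions of n
    print(n, unlabelled, labelled, unlabelled <= labelled <= factorial(n) * unlabelled)
```

Here the ratio of labelled to unlabelled counts sits strictly between the two extremes 1 and n! once n is at least 3.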


II.3. Surjections, set partitions, and words 


This section and the next are devoted to what could be termed level-two non- 
recursive structures defined by the fact that they combine two constructions. In this 
section, we discuss surjections and set partitions (Subsection II. 3.1), which constitute 
labelled analogues of integer compositions and integer partitions in the unlabelled 
universe. The symbolic method then extends naturally to words over a finite alpha- 
bet, where it opens access to an analysis of the frequencies of letters composing words. 
This in turn has useful consequences for the study of classical random allocation problems,
of which the birthday paradox and the coupon collector problem stand out
(Subsection II. 3.2). Figure II.3 summarizes some of the main enumeration results derived
in this section.


II. 3.1. Surjections and set partitions. We examine classes

$$\mathcal{R} = \mathrm{SEQ}(\mathrm{SET}_{\ge 1}(\mathcal{Z})) \quad\text{and}\quad \mathcal{S} = \mathrm{SET}(\mathrm{SET}_{\ge 1}(\mathcal{Z})),$$

corresponding to sequences-of-sets ($\mathcal{R}$) and sets-of-sets ($\mathcal{S}$), or equivalently, sequences
of urns and sets of urns, respectively. Such abstract specifications model basic objects
of discrete mathematics, namely surjections ($\mathcal{R}$) and set partitions ($\mathcal{S}$).
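                 | Specification | EGF | coefficient |
Surjections:     | $\mathcal{R} = \mathrm{SEQ}(\mathrm{SET}_{\ge 1}(\mathcal{Z}))$ | $\frac{1}{2 - e^z}$ | $\sim \frac{n!}{2 (\log 2)^{n+1}}$ | (pp. 109, 259)
 - r images      | $\mathcal{R}^{(r)} = \mathrm{SEQ}_r(\mathrm{SET}_{\ge 1}(\mathcal{Z}))$ | $(e^z - 1)^r$ | $r!\,{n \brace r}$ | (p. 107)
Set partitions:  | $\mathcal{S} = \mathrm{SET}(\mathrm{SET}_{\ge 1}(\mathcal{Z}))$ | $e^{e^z - 1}$ | $\approx \frac{n!}{(\log n)^n}$ | (pp. 109, 560)
 - r blocks      | $\mathcal{S}^{(r)} = \mathrm{SET}_r(\mathrm{SET}_{\ge 1}(\mathcal{Z}))$ | $\frac{1}{r!}(e^z - 1)^r$ | ${n \brace r}$ | (p. 108)
 - blocks ≤ b    | $\mathcal{S} = \mathrm{SET}(\mathrm{SET}_{1..b}(\mathcal{Z}))$ | $e^{e_b(z) - 1}$ | $\approx n^{n(1 - 1/b)}$ | (pp. 111, 568)
Words:           | $\mathcal{W} = \mathrm{SEQ}_r(\mathrm{SET}(\mathcal{Z}))$ | $e^{rz}$ | $r^n$ | (p. 112)

Figure II.3. Major enumeration results relative to surjections, set partitions, and words.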


Surjections with r images. In elementary mathematics, a surjection from a set A
to a set B is a function from A to B that assumes each value at least once (an onto
mapping). Fix some integer $r \ge 1$ and let $\mathcal{R}_n^{(r)}$ denote the class of all surjections from
the set [1..n] onto [1..r] whose elements are also called r-surjections. A particular
object $\phi \in \mathcal{R}_9^{(5)}$ is depicted in Figure II.4.

We set $\mathcal{R}^{(r)} = \bigcup_n \mathcal{R}_n^{(r)}$ and proceed to compute the corresponding EGF, $R^{(r)}(z)$.
First, let us observe that an r-surjection $\phi \in \mathcal{R}_n^{(r)}$ is determined by the ordered r-tuple
formed with the collection of all preimage sets, $(\phi^{-1}(1), \phi^{-1}(2), \ldots, \phi^{-1}(r))$,
themselves disjoint non-empty sets of integers that cover the interval [1..n]. In the
case of the surjection $\phi$ of Figure II.4, this alternative representation is

$$\phi : \ [\{2\}, \{1, 3\}, \{4, 6, 8\}, \{9\}, \{5, 7\}].$$

One has the combinatorial specification and EGF relation:

$$\mathcal{R}^{(r)} = \mathrm{SEQ}_r(\mathcal{V}), \quad \mathcal{V} = \mathrm{SET}_{\ge 1}(\mathcal{Z}) \implies R^{(r)}(z) = (e^z - 1)^r. \tag{9}$$

Here $\mathcal{V} \cong \mathcal{U} \setminus \{\epsilon\}$ designates the class of urns ($\mathcal{U}$) that are non-empty, with EGF
$V(z) = e^z - 1$. In words: “a surjection is a sequence of non-empty sets” (Figure II.4).

Expression (9) does solve the counting problem for surjections. For small r, one
finds

$$R^{(2)}(z) = e^{2z} - 2e^z + 1, \qquad R^{(3)}(z) = e^{3z} - 3e^{2z} + 3e^z - 1,$$

whence, by expanding (for $n \ge 1$),

$$R_n^{(2)} = 2^n - 2, \qquad R_n^{(3)} = 3^n - 3 \cdot 2^n + 3.$$

The general formula follows similarly from expanding the rth power in (9) by the
binomial theorem, and then extracting coefficients:

$$R_n^{(r)} = n!\,[z^n] \sum_{j=0}^{r} \binom{r}{j} (-1)^j e^{(r-j)z} = \sum_{j=0}^{r} \binom{r}{j} (-1)^j (r - j)^n. \tag{10}$$
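Formula (10) is simple to confirm on small cases. The sketch below compares it with a brute-force count of onto functions; both function names are illustrative, and the brute force is only practical for small n and r.

```python
from itertools import product
from math import comb

def surjections_formula(n, r):
    """Equation (10): sum over j of C(r, j) (-1)^j (r - j)^n."""
    return sum(comb(r, j) * (-1) ** j * (r - j) ** n for j in range(r + 1))

def surjections_brute_force(n, r):
    """Count the functions from [1..n] to [1..r] that reach every value."""
    return sum(1 for f in product(range(r), repeat=n) if len(set(f)) == r)

for r in (2, 3):
    print(r, [surjections_formula(n, r) for n in range(1, 7)])
# r = 2: [0, 2, 6, 14, 30, 62], i.e. 2^n - 2
# r = 3: [0, 0, 6, 36, 150, 540], i.e. 3^n - 3*2^n + 3
```

The alternating sum is the usual inclusion-exclusion over the set of values that a function fails to reach.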
[Graph of the surjection φ from [1..9] onto [1..5] not reproduced.]

 n:      1  2  3  4  5  6  7  8  9
 φ(n):   2  1  2  3  5  3  5  3  4

 [ {2}, {1, 3}, {4, 6, 8}, {9}, {5, 7} ]

Figure II.4. The decomposition of surjections as sequences-of-sets: a surjection φ
given by its graph (top), its table (second line), and its sequence of preimages (bottom
lines).


▷ II.4. A direct derivation of the surjection EGF. One can verify the result provided by the
symbolic method by returning to first principles. The preimage of value j by a surjection is a
non-empty set of some cardinality $n_j \ge 1$, so that

$$R_n^{(r)} = \sum_{(n_1, n_2, \ldots, n_r)} \binom{n}{n_1, n_2, \ldots, n_r}, \tag{11}$$

the sum being over $n_j \ge 1$, $n_1 + n_2 + \cdots + n_r = n$. Introduce the numbers $V_n := [\![ n \ge 1 ]\!]$,
where $[\![ P ]\!]$ is Iverson’s bracket (p. 58). The formula (11) then assumes the simple form

$$R_n^{(r)} = \sum_{n_1, n_2, \ldots, n_r} \binom{n}{n_1, n_2, \ldots, n_r} V_{n_1} V_{n_2} \cdots V_{n_r}, \tag{12}$$

where the summation now extends to all tuples $(n_1, n_2, \ldots, n_r)$. The EGF of the $V_n$ is
$V(z) = \sum_n V_n z^n/n! = e^z - 1$. Thus the convolution relation (12) leads again to (9). ◁


Set partitions into r blocks. Let $S_n^{(r)}$ denote the number of ways of partitioning
the set [1..n] into r disjoint and non-empty equivalence classes also known as blocks.
We set $\mathcal{S}^{(r)} = \bigcup_n \mathcal{S}_n^{(r)}$, where $\mathcal{S}_n^{(r)}$ is the corresponding class of partitions of [1..n];
these objects are called set partitions (the latter
not to be confused with integer partitions examined in Section I. 3). The enumeration
problem for set partitions is closely related to that of surjections. Symbolically, a
partition is determined as a labelled set of classes (blocks), each of which is a non-empty
urn. Thus, one has

$$\mathcal{S}^{(r)} = \mathrm{SET}_r(\mathcal{V}), \quad \mathcal{V} = \mathrm{SET}_{\ge 1}(\mathcal{Z}) \implies S^{(r)}(z) = \frac{1}{r!}\,(e^z - 1)^r. \tag{13}$$

The basic formula connecting the two counting sequences $R_n^{(r)}$ and $S_n^{(r)}$ is

$$S_n^{(r)} = \frac{1}{r!}\, R_n^{(r)},$$

in accordance with (9) and (13). This can also be interpreted directly: an r-partition is
associated with a group of exactly r! distinct r-surjections, two surjections belonging
to the same group iff one is obtained from the other by permuting the range values,
[1..r].

The numbers $S_n^{(r)} = n!\,[z^n]S^{(r)}(z)$ are known as the Stirling numbers of the second
kind, or better, the Stirling partition numbers. They were already encountered in
connection with encodings by words (Chapter I, p. 62). Knuth, following Karamata,
advocated for the $S_n^{(r)}$ the notation ${n \brace r}$. From (10), an explicit form also exists:

$$S_n^{(r)} \equiv {n \brace r} = \frac{1}{r!} \sum_{j=0}^{r} \binom{r}{j} (-1)^j (r - j)^n. \tag{14}$$

The books by Graham, Knuth, and Patashnik [307] and Comtet [129] contain a thor- 
ough discussion of these numbers; see also Appendix A.8: Stirling numbers, p. 735. 
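The explicit form (14) can be cross-checked against the classical recurrence ${n \brace r} = r\,{n-1 \brace r} + {n-1 \brace r-1}$, which is a standard fact about Stirling partition numbers but is not stated in this section. The function names below are illustrative.

```python
from math import comb, factorial

def stirling2_formula(n, r):
    """Equation (14): (1/r!) * sum_j C(r, j) (-1)^j (r - j)^n."""
    total = sum(comb(r, j) * (-1) ** j * (r - j) ** n for j in range(r + 1))
    return total // factorial(r)   # the alternating sum is divisible by r!

def stirling2_table(n_max):
    """Table s[n][r] built from the classical recurrence (element n either
    joins one of r existing blocks or forms a new block)."""
    s = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    s[0][0] = 1
    for n in range(1, n_max + 1):
        for r in range(1, n + 1):
            s[n][r] = r * s[n - 1][r] + s[n - 1][r - 1]
    return s

table = stirling2_table(8)
print(table[4])          # [0, 1, 7, 6, 1, 0, 0, 0, 0]
print(sum(table[5]))     # 52, the number of all partitions of a 5-element set
```

Multiplying row entries by r! gives back the r-surjection counts of formula (10).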


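All surjections and set partitions. Define now the collection of all surjections
and all set partitions by

$$\mathcal{R} = \bigcup_r \mathcal{R}^{(r)}, \qquad \mathcal{S} = \bigcup_r \mathcal{S}^{(r)}.$$

Thus $\mathcal{R}_n$ is the class of all surjections of [1..n] onto any initial segment of the integers,
and $\mathcal{S}_n$ is the class of all partitions of the set [1..n] into any number of blocks
(Figure II.5). Symbolically, one has

$$\begin{aligned} \mathcal{R} = \mathrm{SEQ}(\mathrm{SET}_{\ge 1}(\mathcal{Z})) &\implies R(z) = \frac{1}{2 - e^z}, \\ \mathcal{S} = \mathrm{SET}(\mathrm{SET}_{\ge 1}(\mathcal{Z})) &\implies S(z) = e^{e^z - 1}. \end{aligned} \tag{15}$$

The numbers $R_n = n!\,[z^n]R(z)$ are called surjection numbers (also, “preferential
arrangements”, EIS A000670). The numbers $S_n$ are the Bell numbers (EIS A000110).
These numbers are easily determined by expanding the EGFs:

$$R(z) = 1 + z + 3\frac{z^2}{2!} + 13\frac{z^3}{3!} + 75\frac{z^4}{4!} + 541\frac{z^5}{5!} + 4683\frac{z^6}{6!} + 47293\frac{z^7}{7!} + \cdots$$
$$S(z) = 1 + z + 2\frac{z^2}{2!} + 5\frac{z^3}{3!} + 15\frac{z^4}{4!} + 52\frac{z^5}{5!} + 203\frac{z^6}{6!} + 877\frac{z^7}{7!} + \cdots.$$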

Explicit expressions as finite double sums result from summing Stirling numbers,

$$R_n = \sum_{r \ge 0} r!\,{n \brace r} \quad\text{and}\quad S_n = \sum_{r \ge 0} {n \brace r},$$

where each Stirling number is itself a sum given by (14). Alternatively, single (though
infinite) sums arise from the expansions

$$R(z) = \frac{1}{2} \cdot \frac{1}{1 - \frac{1}{2} e^z} = \sum_{\ell \ge 0} \frac{1}{2^{\ell + 1}}\, e^{\ell z} \quad\text{and}\quad S(z) = e^{e^z - 1} = \frac{1}{e} \sum_{\ell \ge 0} \frac{1}{\ell!}\, e^{\ell z},$$

from which coefficient extraction yields

$$R_n = \frac{1}{2} \sum_{\ell \ge 0} \frac{\ell^n}{2^\ell} \quad\text{and}\quad S_n = \frac{1}{e} \sum_{\ell \ge 0} \frac{\ell^n}{\ell!}.$$

The formula for Bell numbers was found by Dobinski in 1877.

Figure II.5. A complete listing of all set partitions for sizes n = 1, 2, 3, 4. The
corresponding sequence 1, 1, 2, 5, 15, ... is formed of Bell numbers, EIS A000110.
[Figure not reproduced.]
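The two infinite-sum formulae lend themselves to a quick floating-point check. The sketch below truncates each series after a fixed number of terms (the cut-offs 200 and 60 are arbitrary choices that are ample for small n) and rounds the result to the nearest integer.

```python
from math import e, factorial, fsum

def surjection_number(n, terms=200):
    """R_n = (1/2) * sum over l >= 0 of l^n / 2^l, truncated after `terms` terms."""
    return 0.5 * fsum(l ** n / 2 ** l for l in range(terms))

def bell_number(n, terms=60):
    """Dobinski: S_n = (1/e) * sum over l >= 0 of l^n / l!, truncated."""
    return fsum(l ** n / factorial(l) for l in range(terms)) / e

print([round(surjection_number(n)) for n in range(8)])
# [1, 1, 3, 13, 75, 541, 4683, 47293]
print([round(bell_number(n)) for n in range(8)])
# [1, 1, 2, 5, 15, 52, 203, 877]
```

Note that the term with l = 0 contributes only when n = 0, under the convention $0^0 = 1$ that Python also follows.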

The asymptotic analysis of the surjection numbers $(R_n)$ will be performed in Example IV.7
(p. 259), as one of the very first illustrations of complex asymptotic methods
(the meromorphic case); that of Bell’s partition numbers is best done by means of
the saddle-point method (Example VIII.6, p. 560). The asymptotic forms found are

$$R_n \sim \frac{n!}{2}\, \frac{1}{(\log 2)^{n+1}} \quad\text{and}\quad S_n \sim n!\, \frac{e^{e^r - 1}}{r^n \sqrt{2\pi r (r + 1) e^r}}, \tag{16}$$

where $r \equiv r(n)$ is the positive root of the equation $r e^r = n + 1$. One has
$r(n) \sim \log n - \log\log n$, so that

$$\log S_n = n \left( \log n - \log\log n - 1 + o(1) \right).$$


Elementary derivations (i.e., based solely on real analysis) of these asymptotic forms 
are also possible, a fact discussed briefly in Appendix B.6: Laplace’s method, p. 755. 
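The quality of the first estimate in (16) is striking even for small n. The sketch below computes exact surjection numbers from the recurrence $R_n = \sum_{k \ge 1} \binom{n}{k} R_{n-k}$, which follows from $R(z) = 1 + (e^z - 1) R(z)$ (choose the first preimage set), and compares them with $n!/(2 (\log 2)^{n+1})$. Function names are illustrative.

```python
from math import comb, factorial, log

def surjection_numbers(n_max):
    """Exact R_0..R_n_max via R_n = sum_{k=1..n} C(n, k) * R_{n-k}."""
    r = [1]
    for n in range(1, n_max + 1):
        r.append(sum(comb(n, k) * r[n - k] for k in range(1, n + 1)))
    return r

def surjection_estimate(n):
    """Leading asymptotic term n! / (2 (log 2)^(n+1))."""
    return factorial(n) / (2 * log(2) ** (n + 1))

exact = surjection_numbers(12)
for n in (2, 4, 8, 12):
    print(n, exact[n], exact[n] / surjection_estimate(n))
```

The ratio converges to 1 exponentially fast, as expected when the error comes from poles farther from the origin than the dominant one at log 2.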

The line of reasoning adopted for enumerating surjections viewed as sequences- 
of-sets and partitions viewed as sets-of-sets yields a general result that is applicable to 
a wide variety of constrained objects. 


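Proposition II.2. The class $\mathcal{R}^{(A,B)}$ of surjections, where the cardinalities of the
preimages lie in $A \subseteq \mathbb{Z}_{\ge 1}$ and the cardinality of the range belongs to B, has EGF

$$R^{(A,B)}(z) = \beta(\alpha(z)) \quad\text{where}\quad \alpha(z) = \sum_{a \in A} \frac{z^a}{a!}, \quad \beta(z) = \sum_{b \in B} z^b.$$

The class $\mathcal{S}^{(A,B)}$ of set partitions with block sizes in $A \subseteq \mathbb{Z}_{\ge 1}$ and with a number
of blocks that belongs to B has EGF

$$S^{(A,B)}(z) = \beta(\alpha(z)) \quad\text{where}\quad \alpha(z) = \sum_{a \in A} \frac{z^a}{a!}, \quad \beta(z) = \sum_{b \in B} \frac{z^b}{b!}.$$

Proof. One has $\mathcal{R}^{(A,B)} = \mathrm{SEQ}_B(\mathrm{SET}_A(\mathcal{Z}))$ and $\mathcal{S}^{(A,B)} = \mathrm{SET}_B(\mathrm{SET}_A(\mathcal{Z}))$, where,
in accordance with our general convention of p. 30, the notation $\mathfrak{K}_\Omega$ specifies a
construction $\mathfrak{K}$ with a number of components restricted to set $\Omega$. ∎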


Example II.6. Smallest and largest blocks in set partitions. Let $e_b(z)$ denote the truncated
exponential function,

$$e_b(z) := 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots + \frac{z^b}{b!}.$$

The EGFs $S^{(\le b)}(z) = \exp(e_b(z) - 1)$ and $S^{(> b)}(z) = \exp(e^z - e_b(z))$ correspond to partitions
with all blocks of size $\le b$ and all blocks of size $> b$, respectively. □


▷ II.5. No singletons. The EGF of partitions without singleton parts is $e^{e^z - 1 - z}$. The EGF of
“double surjections” (each preimage contains at least two elements) is $(2 + z - e^z)^{-1}$. ◁
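The counts behind Note II.5 can be generated without any series manipulation, by decomposing on one distinguished block. The sketch below uses two elementary recurrences that are equivalent to the EGFs $\exp(\alpha(z))$ and $1/(1 - \alpha(z))$ with $\alpha(z) = \sum_{a \in A} z^a/a!$; the recurrences and the function names are an illustration added here, not the method of the text.

```python
from math import comb

def set_of_sets(n_max, allowed):
    """Counts for SET(SET_A(Z)): set partitions with block sizes in `allowed`.
    The block containing the largest element has k other members."""
    s = [1]
    for n in range(n_max):
        s.append(sum(comb(n, k) * s[n - k]
                     for k in range(n + 1) if (k + 1) in allowed))
    return s

def seq_of_sets(n_max, allowed):
    """Counts for SEQ(SET_A(Z)): the first preimage set has size k."""
    r = [1]
    for n in range(1, n_max + 1):
        r.append(sum(comb(n, k) * r[n - k]
                     for k in range(1, n + 1) if k in allowed))
    return r

at_least_two = set(range(2, 7))
print(set_of_sets(6, at_least_two))   # partitions without singletons
print(seq_of_sets(6, at_least_two))   # "double surjections"
```

Passing the set of all positive sizes instead returns the Bell numbers and the surjection numbers, so the same two functions cover the unrestricted case of Proposition II.2 with B equal to all integers.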


Example II.7. Comtet’s square. An exercise in Comtet’s book [129, Ex. 13, p. 225] serves
beautifully to illustrate the power of the symbolic method. The question is to enumerate set
partitions such that a parity constraint is satisfied by the number of blocks and/or the number of
elements in each block. Then, the EGFs are tabulated as follows:

Set partitions:  | Any # of blocks    | Odd # of blocks       | Even # of blocks
any block sizes  | $e^{e^z - 1}$      | $\sinh(e^z - 1)$      | $\cosh(e^z - 1)$
odd block sizes  | $e^{\sinh z}$      | $\sinh(\sinh z)$      | $\cosh(\sinh z)$
even block sizes | $e^{\cosh z - 1}$  | $\sinh(\cosh z - 1)$  | $\cosh(\cosh z - 1)$

The proof is a direct application of Proposition II.2, upon noting that $e^z$, $\sinh z$, $\cosh z$ are the
characteristic EGFs of $\mathbb{Z}_{\ge 0}$, $2\mathbb{Z}_{\ge 0} + 1$, and $2\mathbb{Z}_{\ge 0}$ respectively. The sought EGFs are then
obtained by forming the compositions

$$\left\{ \begin{matrix} \exp \\ \sinh \\ \cosh \end{matrix} \right\} \circ \left\{ \begin{matrix} -1 + \exp \\ \sinh \\ -1 + \cosh \end{matrix} \right\}$$

in accordance with general principles. □


II. 3.2. Applications to words and random allocations. Numerous enumera- 
tion problems present themselves when analysing statistics on letters in words. They 
find applications in the study of random allocations [388] and the design of hashing 
algorithms in computer science [378, 538]. Fix an alphabet 


$$\mathcal{X} = \{a_1, a_2, \ldots, a_r\}$$

of cardinality r, and let $\mathcal{W}$ be the class of all words over the alphabet $\mathcal{X}$, the size of
a word being its length. A word $w \in \mathcal{W}_n$ of length n can be viewed as a function
from [1..n] to [1..r], namely the function associating to each position the value of
the corresponding letter (canonically numbered from 1 to r) in the word. For instance,
let $\mathcal{X} = \{a, b, c, d, p, q, r\}$ and take the letters of $\mathcal{X}$ canonically numbered as $a_1 = a, \ldots, a_7 = r$;
for the word w = “abracadabra”, the table giving the position-to-letter
mapping is

 letter:    a  b  r  a  c  a  d  a  b  r   a
 position:  1  2  3  4  5  6  7  8  9  10  11
 value:     1  2  7  1  3  1  4  1  2  7   1

which is itself determined by its sequence of preimages:

 a = a_1: {1, 4, 6, 8, 11};  b = a_2: {2, 9};  c = a_3: {5};  d = a_4: {7};
 p = a_5: ∅;  q = a_6: ∅;  r = a_7: {3, 10}.

This decomposition is the same as the one used for surjections; only, it is no longer
imposed that all preimages should be non-empty.
The decomposition based on preimages then gives, with $\mathcal{U}$ the class of all urns

$$\mathcal{W} \cong \mathcal{U}^r \equiv \mathrm{SEQ}_r(\mathcal{U}) \implies W(z) = (e^z)^r = e^{rz}, \tag{17}$$

which yields back $W_n = r^n$, as was to be expected. In summary: words over an r-ary
alphabet are equivalent to functions into a set of cardinality r and are described by an
r-fold labelled product.

For the situation where restrictions are imposed on the number of occurrences of 
letters, the decomposition (17) generalizes as follows. 


Proposition II.3. Let $\mathcal{W}^{(A)}$ denote the family of words over an alphabet of cardinality
r, such that the number of occurrences of each letter lies in a set A. Then

$$W^{(A)}(z) = \alpha(z)^r \quad\text{where}\quad \alpha(z) = \sum_{a \in A} \frac{z^a}{a!}. \tag{18}$$

The proof is a one-liner: $\mathcal{W}^{(A)} \cong \mathrm{SEQ}_r(\mathrm{SET}_A(\mathcal{Z}))$. Although this result is technically
a shallow consequence of the symbolic method, it has several important applications
in discrete probability, as we see next.


Example II.8. Restricted words. The EGF of words containing each letter at most b times, and
that of words containing each letter more than b times are

$$W^{(\le b)}(z) = e_b(z)^r, \qquad W^{(> b)}(z) = \left( e^z - e_b(z) \right)^r, \tag{19}$$

respectively. (Observe the analogy with Example II.6, p. 111.) Taking b = 1 in the first formula
gives the number of n-arrangements of r elements (i.e., of ordered combinations of n elements
among r possibilities),

$$n!\,[z^n] (1 + z)^r = n! \binom{r}{n} = r (r - 1) \cdots (r - n + 1), \tag{20}$$

as anticipated; taking b = 0, but now in the second formula, gives back the number of
r-surjections. For general b, the generating functions of (19) contain valuable information on the
least frequent and most frequent letter in random words. □
Example II.9. Random allocations (balls-in-bins model). Throw at random $n$ distinguishable
balls into $m$ distinguishable bins. A particular realization is described by a word of length $n$
(balls are distinguishable, say, as numbers from 1 to $n$) over an alphabet of cardinality $m$
(representing the bins chosen). Let Min and Max represent the size of the least filled and most filled
bins, respectively. Then⁴,


(21)   $\mathbb{P}\{\mathrm{Max} \le b\} = n!\,[z^n]\, e_b\!\left(\frac{z}{m}\right)^{m}, \qquad \mathbb{P}\{\mathrm{Min} > b\} = n!\,[z^n] \left(e^{z/m} - e_b\!\left(\frac{z}{m}\right)\right)^{m}.$

The justification of this formula relies on the easy identity

(22)   $\frac{1}{m^n}\,[z^n] f(z) = [z^n] f\!\left(\frac{z}{m}\right),$


and on the fact that a probability is determined as the ratio between the number of favorable
cases (given by (19)) and the total number of cases ($m^n$). The formulae of (21) lend themselves
to evaluation using symbolic manipulation systems; for instance, with $m = 100$ and $n = 200$,
one finds, for $\mathbb{P}(\mathrm{Max} = k)$:


k             2          4             5      6      7      8      9      12           15           20
P(Max = k)    10^{-59}   1.4·10^{-3}   0.17   0.46   0.26   0.07   0.01   9·10^{-5}    2·10^{-7}    4·10^{-19}


The values k = 5, 6, 7, 8 concentrate about 99% of the probability mass.
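
As an illustration of how such values can be obtained without a computer algebra system, here is a minimal Python sketch of the first formula of (21), assuming $e_b$ denotes the truncated exponential $e_b(z) = \sum_{j \le b} z^j/j!$. The function names are hypothetical and the arithmetic is exact (rational), so the sketch favours transparency over speed.

```python
from fractions import Fraction
from math import factorial


def prob_max_at_most(n, m, b):
    """P{Max <= b} = n! [z^n] e_b(z/m)^m for n balls thrown into m bins."""
    # Coefficients of e_b(z/m), truncated at degree n.
    base = [Fraction(1, factorial(j) * m ** j) if j <= b else Fraction(0)
            for j in range(n + 1)]
    series = [Fraction(1)] + [Fraction(0)] * n
    for _ in range(m):                             # raise e_b(z/m) to the power m
        series = [sum(series[i] * base[k - i] for i in range(max(0, k - b), k + 1))
                  for k in range(n + 1)]
    return series[n] * factorial(n)


def prob_max_equal(n, m, k):
    """P{Max = k}, obtained as a difference of two cumulative values."""
    lower = prob_max_at_most(n, m, k - 1) if k >= 1 else Fraction(0)
    return prob_max_at_most(n, m, k) - lower


# Small example: 3 balls into 3 bins; Max = 1 means that all bins receive one ball.
print(prob_max_equal(3, 3, 1))
```

A call such as `float(prob_max_equal(200, 100, 6))` can be used to recompute individual entries of the kind tabulated above.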

An especially interesting case is when $m$ and $n$ are asymptotically proportional, that is,
$n/m = \alpha$ and $\alpha$ lies in a compact subinterval of $(0, +\infty)$. In that case, with probability tending
to 1 as $n$ tends to infinity, one has


$\mathrm{Min} = 0, \qquad \mathrm{Max} \sim \frac{\log n}{\log\log n}.$

In other words, there are, almost surely, empty urns (in fact many of them, see Example II.10,
p. 177) and the most filled urn grows logarithmically in size (Example VIII.14, p. 598). Such
probabilistic properties are best established by complex analytic methods, whose starting point
is exact generating function representations such as (19) and (21). They form the core of the
reference book [388] by Kolchin, Sevastyanov, and Chistyakov. The resulting estimates are in
turn invaluable in the analysis of hashing algorithms [301, 378, 538] to which the balls-in-bins
model has been recognized to apply with great accuracy [425]. □


▷ II.6. Number of different letters in words. The probability that a random word of length $n$
over an alphabet of cardinality $r$ contains $k$ different letters is (with $\left\{{n \atop k}\right\}$ a Stirling number)


$p^{(r)}_{n,k} = \frac{1}{r^n}\binom{r}{k}\, k!\left\{{n \atop k}\right\}.$


(Choose $k$ letters among $r$, then split the $n$ positions into $k$ distinguished non-empty classes.)
The quantity $p^{(r)}_{n,k}$ is also the probability that a random mapping from $[1..n]$ to $[1..r]$ has an
image of cardinality $k$. ◁


▷ II.7. Arrangements. An arrangement of size $n$ is an ordered combination of (some) elements
of $[1..n]$. Let $\mathcal{A}$ be the class of all arrangements. Grouping together into an urn all the elements
not present in the arrangement shows that a specification and its companion EGF are [129, p. 75]


$\mathcal{A} \cong \mathcal{U} \star \mathcal{P}, \quad \mathcal{U} = \mathrm{SET}(\mathcal{Z}), \quad \mathcal{P} = \mathrm{SEQ}(\mathcal{Z}) \quad\Longrightarrow\quad A(z) = \frac{e^z}{1-z}.$


The counting sequence $A_n = \sum_{k=0}^{n} \frac{n!}{k!}$ starts as 1, 2, 5, 16, 65, 326, 1957 (EIS A000522). ◁


⁴ We let $\mathbb{P}(E)$ represent the probability of an event $E$ and $\mathbb{E}(X)$ the expectation of the random
variable $X$; cf. Appendix A.3: Combinatorial probability, p. 727 and Appendix C.2: Random variables, p. 771.


Birthday paradox and coupon collector problem. The next two examples show 
applications of EGFs to two classical problems of probability theory, the birthday 
paradox and the coupon collector problem. They constitute a neat illustration of the 
fact that the symbolic method may be used to analyse discrete probabilistic models— 
this theme is explored systematically in Chapter III, as regards exact results, and Chap- 
ter IX, which is dedicated to asymptotic laws. 

Assume that there is a very long line of persons ready to enter a very large room 
one by one. Each person is let in and declares her birthday upon entering the room. 
How many people must enter in order to find two that have the same birthday? The 
birthday paradox is the counterintuitive fact that on average a birthday collision is 
likely to take place as early as at time n = 24. Dually, the coupon collector problem 
asks for the average number of persons that must enter in order to exhaust all the 
possible days in the year as birthdates. In this case, the average is the rather large 
number n′ = 2364. (The term “coupon collection” refers to the situation where images
or coupons of various sorts are inserted in sales items and some premium is given to 
those who succeed in gathering a complete collection.) The birthday problem and 
the coupon collector problem are relative to a potentially infinite sequence of events; 
however, the fact that the first birthday collision or the first complete collection occurs 
at any fixed time n only involves finite events. The following diagram illustrates the 
events of interest: 


n = 0            B (1st collision)            C (complete collection)
  |--------------------|..........................|-------------------->
        INJECTIVE                                      SURJECTIVE


In other words, we seek the time at which injectivity ceases to hold (the first birthday
collision, $B$) and the time at which surjectivity begins to be satisfied (a complete
collection, $C$). In what follows, we consider a year with $r$ days (readers from Earth may
take $r = 365$) and let $\mathcal{X}$ represent an alphabet with $r$ letters (the days in the year).


Example II.10. Birthday paradox. Let $B$ be the time of the first collision, which is a random
variable ranging between 2 and $r + 1$ (where the upper bound is derived from the pigeonhole
principle). A collision has not yet occurred at time $n$ if the sequence of birthdates $\beta_1, \ldots, \beta_n$
has no repetition. In other words, the function $\beta$ from $[1..n]$ to $\mathcal{X}$ must be injective; equivalently,
$\beta_1, \ldots, \beta_n$ is an $n$-arrangement of $r$ objects. Thus, we have the fundamental relation

(23)   $\mathbb{P}\{B > n\} = \frac{r(r-1)\cdots(r-n+1)}{r^n}$
                  $= \frac{n!}{r^n}\,[z^n](1+z)^r$
                  $= n!\,[z^n]\left(1 + \frac{z}{r}\right)^{r},$


where the second line repeats (20) and the third results from the series transformation (22).
The expectation of the random variable $B$ is elementarily

(24)   $\mathbb{E}(B) = \sum_{n=0}^{\infty} \mathbb{P}\{B > n\},$


this by virtue of a general formula valid for all discrete random variables (Appendix C.2: Random
variables, p. 771). From (23), line 1, this gives us a sum expressing the expectation:
namely,


(25)   $\mathbb{E}(B) = 1 + \sum_{n=1}^{r} \frac{r(r-1)\cdots(r-n+1)}{r^n}.$

For instance, with $r = 365$, one finds that the expectation is the rational number,

$\mathbb{E}(B) = \frac{12681\cdots06674}{5151\cdots0625} \doteq 24.61658,$


where the denominator comprises as many as 864 digits.


An alternative form of the expectation is derived from the generating function involved
in (23), line 3. Let $f(z) = \sum_{n \ge 0} f_n z^n$ be an entire function with non-negative coefficients. Then
the formula


(26)   $\sum_{n=0}^{\infty} f_n\, n! = \int_0^{\infty} e^{-t} f(t)\, dt,$

a particular case of the Laplace transform, is valid provided either the sum or the integral on
the right converges. The proof is a direct consequence of the usual Eulerian representation of
factorials,

$n! = \int_0^{\infty} e^{-t} t^n\, dt.$


Applying this principle to (24) with the probabilities given by (23) [third line], one finds


(27)   $\mathbb{E}(B) = \int_0^{\infty} e^{-t}\left(1 + \frac{t}{r}\right)^{r} dt.$


Asymptotic analysis can take up from here. The Laplace method⁵ can be applied either
in its version for discrete sums to (25) or in its version for integrals to (27); see Appendix B.6:
Laplace’s method, p. 755. Either way provides the estimate


(28)   $\mathbb{E}(B) = \sqrt{\frac{\pi r}{2}} + \frac{2}{3} + O(r^{-1/2}),$


as $r$ tends to infinity. In particular, the approximation provided by the first two terms of (28),
for $r = 365$, is 24.6112, which only represents a relative error of $2 \cdot 10^{-4}$. See also a sample
realization in Figure II.6, corresponding to $r = 20$. The quantity $\mathbb{E}(B)$ is related to Ramanujan’s
Q-function (see Equation (50), p. 130) by $\mathbb{E}(B) = 1 + Q(r)$, and we shall examine a global
way to deal with an entire class of related sums in Example VI.13, p. 416.
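
The exact value and its two-term approximation are easy to reproduce. The short Python sketch below (function name ours) evaluates the finite sum $1 + \sum_{n=1}^{r} r(r-1)\cdots(r-n+1)/r^n$ as an exact rational and compares it with $\sqrt{\pi r/2} + 2/3$.

```python
from fractions import Fraction
from math import pi, sqrt


def expected_first_collision(r):
    """E(B) = 1 + sum_{n=1}^{r} r(r-1)...(r-n+1) / r^n, as an exact rational."""
    total = Fraction(1)              # the n = 0 term, P{B > 0} = 1
    term = Fraction(1)               # running value of P{B > n}
    for n in range(1, r + 1):
        term *= Fraction(r - n + 1, r)
        total += term
    return total


exact = expected_first_collision(365)
approx = sqrt(pi * 365 / 2) + 2 / 3
print(float(exact), approx, abs(float(exact) - approx) / float(exact))
```

The last printed quantity is the relative error of the two-term approximation.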

⁵ Knuth [377, Sec. 1.2.11.3] uses this calculation as a pilot example for (real) asymptotic analysis.


[Figure II.6: plot of the letter chosen (vertical axis) against the time of arrival (horizontal axis).]

Figure II.6. A sample realization of the “birthday paradox” and “coupon collection”
with an alphabet of $r = 20$ letters. The first collision occurs at time $B = 6$ and the
collection becomes complete at time $C = 87$.


The interest of such integral representations based on generating functions is that they
are robust: they adjust naturally to many kinds of combinatorial conditions. For instance, the
same calculations applied to (21) prove the following: the expected time necessary for the
first occurrence of the event “$b$ persons have the same birthday” has expectation given by the
integral


(29)   $I(r, b) := \int_0^{\infty} e^{-t}\, e_{b-1}\!\left(\frac{t}{r}\right)^{r} dt.$


(The basic birthday paradox corresponds to $b = 2$.) The formula (29) was first derived by
Klamkin and Newman in 1967; their paper [366] shows in addition that


$I(r, b) \sim \sqrt[b]{b!}\;\Gamma\!\left(1 + \frac{1}{b}\right) r^{1 - 1/b} \qquad (r \to \infty),$


once more a consequence of Laplace’s method. The asymptotic form evaluates to 82.87, for
$r = 365$ and $b = 3$, and the exact value of the expectation is 88.73891. Thus three-way
collisions also tend to occur much sooner than one might think, after about 89 persons on
average. Globally, such developments illustrate the versatility of the symbolic approach and its
applicability to many basic probabilistic problems (see also Subsection III. 6.1, p. 189). □


▷ II.8. The probability distribution of time till a birthday collision. Elementary approximations
show that, for large $r$, and in the “central” regime $n = t\sqrt{r}$, one has


$\mathbb{P}(B > t\sqrt{r}) \sim e^{-t^2/2}, \qquad \mathbb{P}(B = t\sqrt{r}) \sim \frac{1}{\sqrt{r}}\, t\, e^{-t^2/2}.$


The continuous probability distribution with density $t e^{-t^2/2}$ is called a Rayleigh distribution.
Saddle-point methods (Chapter VIII) may be used to show that for the first occurrence of a
$b$-fold birthday collision: $\mathbb{P}(B > t\, r^{1-1/b}) \sim e^{-t^b/b!}$. ◁


Example II.11. Coupon collector problem. This problem is dual to the birthday paradox. We
ask for the first time $C$ when $\beta_1, \ldots, \beta_C$ contains all the elements of $\mathcal{X}$: that is, all the possible
birthdates have been “collected”. In other words, the event $\{C \le n\}$ means the equality between
sets, $\{\beta_1, \ldots, \beta_n\} = \mathcal{X}$. Thus, the probabilities satisfy

(30)   $\mathbb{P}\{C \le n\} = \frac{R_n^{(r)}}{r^n} = \frac{r!\left\{{n \atop r}\right\}}{r^n}$
                    $= \frac{n!}{r^n}\,[z^n]\left(e^z - 1\right)^r$
                    $= n!\,[z^n]\left(e^{z/r} - 1\right)^r,$


by our earlier enumeration of surjections. The complementary probabilities are then


$\mathbb{P}\{C > n\} = 1 - \mathbb{P}\{C \le n\} = n!\,[z^n]\left(e^z - \left(e^{z/r} - 1\right)^r\right).$

An application of the Eulerian integral trick of (27) then provides a representation of the
expectation of the time needed for a full collection as


(31)   $\mathbb{E}(C) = \int_0^{\infty} \left(1 - \left(1 - e^{-t/r}\right)^r\right) dt.$

A simple calculation (expand by the binomial theorem and integrate termwise) shows that


$\mathbb{E}(C) = r \sum_{j=1}^{r} \binom{r}{j} \frac{(-1)^{j-1}}{j},$


which constitutes a first answer to the coupon collector problem in the form of an alternating
sum. Alternatively, in (31), perform the change of variables $v = 1 - e^{-t/r}$, then expand and
integrate termwise; this process provides the more tractable form


(32)   $\mathbb{E}(C) = r H_r,$

where $H_r$ is the harmonic number:

(33)   $H_r = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{r}.$


Formula (32) is by the way easy to interpret directly⁶: one needs on average $1 = r/r$ trials to
get the first day, then $r/(r-1)$ to get a different day, etc.

Regarding (32), one has available the well-known formula (by comparing sums with integrals
or by Euler-Maclaurin summation),


$H_r = \log r + \gamma + \frac{1}{2r} + O(r^{-2}), \qquad \gamma \doteq 0.57721\,56649,$


where $\gamma$ is known as Euler’s constant. Thus, the expected time for a full collection satisfies


(34)   $\mathbb{E}(C) = r \log r + \gamma r + \frac{1}{2} + O(r^{-1}).$


Here the “surprise” lies in the nonlinear growth of the expected time for a full collection. For a
year on Earth, $r = 365$, the exact expected value is $\doteq 2364.64602$ whereas the approximation
provided by the first three terms of (34) yields 2364.64625, representing a relative error of only
one in ten million.
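
These figures can be reproduced in a few lines. The Python sketch below (function names ours) evaluates the expectation as $r H_r$, as the alternating binomial sum $r \sum_{j} \binom{r}{j} (-1)^{j-1}/j$, and through the three-term approximation $r \log r + \gamma r + 1/2$; the first two are computed exactly and must coincide.

```python
from fractions import Fraction
from math import comb, log

EULER_GAMMA = 0.5772156649      # Euler's constant, to ten decimal places


def harmonic(r):
    """Exact harmonic number H_r."""
    return sum(Fraction(1, j) for j in range(1, r + 1))


def expected_full_collection(r):
    """E(C) = r * H_r."""
    return r * harmonic(r)


def expected_full_collection_alternating(r):
    """E(C) as the alternating sum r * sum_{j=1}^{r} C(r, j) (-1)^(j-1) / j."""
    return r * sum(Fraction((-1) ** (j - 1) * comb(r, j), j) for j in range(1, r + 1))


def asymptotic_full_collection(r):
    """Three-term approximation r log r + gamma r + 1/2."""
    return r * log(r) + EULER_GAMMA * r + 0.5


print(float(expected_full_collection(365)), asymptotic_full_collection(365))
```

The alternating sum involves very large binomial coefficients with massive cancellation, which is why exact rationals rather than floating-point numbers are used for it.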

As usual, the symbolic treatment adapts to a variety of situations, for instance, to multiple
collections. One finds: the expected time till each item (birthday or coupon) is obtained $b$ times
is


$J(r, b) = \int_0^{\infty} \left(1 - \left(1 - e_{b-1}(t/r)\, e^{-t/r}\right)^r\right) dt.$


This expression vastly generalizes the standard case (31), which corresponds to $b = 1$. From it,
one finds [454]


$J(r, b) = r\left(\log r + (b-1) \log\log r + \gamma - \log (b-1)! + o(1)\right),$

so that only a few more trials are needed in order to obtain additional collections. □


⁶ Such elementary derivations are very much problem specific: contrary to the symbolic method, they
do not usually generalize to more complex situations.


▷ II.9. The little sister. The coupon collector has a little sister to whom he gives his duplicates.
Foata, Lass, and Han [266] show that the little sister misses on average $H_r$ coupons when her
big brother first obtains a complete collection. ◁


▷ II.10. The probability distribution of time till a complete collection. The saddle-point method
(Chapter VIII) may be used to prove that, in the regime $n = r \log r + t r$, we have

$\lim_{r \to \infty} \mathbb{P}(C \le r \log r + t r) = e^{-e^{-t}}.$


This continuous probability distribution is known as a double exponential distribution. For the
time $C^{(b)}$ till a collection of multiplicity $b$, one has


$\lim_{r \to \infty} \mathbb{P}(C^{(b)} < r \log r + (b-1)\, r \log\log r + t r) = \exp\left(-e^{-t}/(b-1)!\right),$

a property known as the Erdős–Rényi law, which finds application in the study of random
graphs [195]. ◁

Words as both labelled and unlabelled objects. What distinguishes a labelled
structure from an unlabelled one? There is nothing intrinsic there, and everything is in
the eye of the beholder, or rather in the type of construction adopted when modelling
a specific problem. Take the class of words $\mathcal{W}$ over an alphabet of cardinality $r$. The
two generating functions (an OGF and an EGF respectively),


$W(z) \equiv \sum_n W_n z^n = \frac{1}{1 - rz} \qquad\text{and}\qquad \widehat{W}(z) \equiv \sum_n W_n \frac{z^n}{n!} = e^{rz},$


leading in both cases to $W_n = r^n$, correspond to two different ways of constructing
words: the first one directly as an unlabelled sequence, the other as a labelled power of
letter positions. A similar situation arises for $r$-partitions, for which we find as OGF
and EGF,

$S^{(r)}(z) = \frac{z^r}{(1-z)(1-2z)\cdots(1-rz)} \qquad\text{and}\qquad \widehat{S}^{(r)}(z) = \frac{(e^z - 1)^r}{r!},$

by viewing these either as unlabelled structures (an encoding via words of a regular
language in Section I. 4.3, p. 62) or directly as labelled structures (this chapter, p. 108).


▷ II.11. Balls switching chambers: the Ehrenfest² model. Consider a system of two chambers
A and B (also classically called “urns”). There are $N$ distinguishable balls, and, initially,
chamber A contains them all. At any instant $\frac{1}{2}, \frac{3}{2}, \ldots$, one ball is allowed to change from one
chamber to the other. Let $E_n^{[\ell]}$ be the number of possible evolutions that lead to chamber A
containing $\ell$ balls at instant $n$ and $E^{[\ell]}(z)$ the corresponding EGF. Then


$E^{[\ell]}(z) = \binom{N}{\ell} (\cosh z)^{\ell} (\sinh z)^{N-\ell}, \qquad E^{[N]}(z) = (\cosh z)^N \equiv 2^{-N}\left(e^z + e^{-z}\right)^N.$


[Hint: the EGF $E^{[N]}$ enumerates mappings where each preimage has an even cardinality.] In
particular the probability that urn A is again full at time $2n$ is


$\frac{1}{2^N N^{2n}} \sum_{k=0}^{N} \binom{N}{k} (N - 2k)^{2n}.$


This famous model was introduced by Paul and Tatiana Ehrenfest [188] in 1907, as a simplified
model of heat transfer. It helped resolve the apparent contradiction between irreversibility in
thermodynamics (the case $N \to \infty$) and recurrence of systems undergoing ergodic transformations
(the case $N < \infty$). See especially Mark Kac’s discussion [361]. The analysis can also
be carried out by combinatorial methods akin to those of weighted lattice paths: see Note V.25,
p. 336 and [304]. ◁
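
The return probability of the Ehrenfest model can be cross-checked by direct computation. The Python sketch below is an illustration under the assumption that, at each step, one of the $N$ balls is chosen uniformly at random and changes chamber (so that there are $N^{n}$ equally likely evolutions of length $n$); function names are ours. It compares the closed form $2^{-N} N^{-2n} \sum_k \binom{N}{k} (N-2k)^{2n}$ with a step-by-step evolution of the distribution of the content of chamber A.

```python
from fractions import Fraction
from math import comb


def return_probability_formula(N, n):
    """Closed form 2^{-N} N^{-2n} sum_k C(N, k) (N - 2k)^{2n}."""
    total = sum(comb(N, k) * (N - 2 * k) ** (2 * n) for k in range(N + 1))
    return Fraction(total, 2 ** N * N ** (2 * n))


def return_probability_dp(N, n):
    """Same probability, by evolving the distribution of the number of balls in A."""
    dist = [Fraction(0)] * (N + 1)
    dist[N] = Fraction(1)                              # chamber A starts full
    for _ in range(2 * n):
        new = [Fraction(0)] * (N + 1)
        for a, p in enumerate(dist):
            if a > 0:
                new[a - 1] += p * Fraction(a, N)       # the chosen ball leaves A
            if a < N:
                new[a + 1] += p * Fraction(N - a, N)   # the chosen ball enters A
        dist = new
    return dist[N]


print(return_probability_formula(4, 3), return_probability_dp(4, 3))
```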


II. 4. Alignments, permutations, and related structures 


In this section, we start by considering specifications built by piling up two con- 
structions, sequences-of-cycles and sets-of-cycles respectively. They define a new 
class of objects, alignments, while serving to specify permutations in a novel way. 
(These specifications otherwise parallel surjections and set partitions.) In this context, 
permutations are examined under their cycle decomposition, the corresponding enumeration
results being the most important ones combinatorially (Subsection II. 4.1 and
Figure II.8, p. 123). In Subsection II. 4.2, we recapitulate the meaning of classes that
can be defined iteratively by a combination of any two nested labelled constructions. 


II. 4.1. Alignments and permutations. The two specifications under consider- 
ation now are 


(35) O = SEQ(CYC(Z)), and P = SET(CYC(Z)), 
specifying new objects called alignments (O) as well as an important decomposition 
of permutations (P). 


Alignments. An alignment is a well-labelled sequence of cycles. Let O be the 
class of all alignments. Schematically, one can visualize an alignment as a collection 
of directed cycles arranged in a linear order, somewhat like slices of a sausage fastened 
on a skewer: 


The symbolic method provides,


$\mathcal{O} = \mathrm{SEQ}(\mathrm{CYC}(\mathcal{Z})) \quad\Longrightarrow\quad O(z) = \frac{1}{1 - \log\frac{1}{1-z}},$

and the expansion starts as

$O(z) = 1 + z + 3\frac{z^2}{2!} + 14\frac{z^3}{3!} + 88\frac{z^4}{4!} + 694\frac{z^5}{5!} + \cdots,$


but the coefficients (see EIS A007840: “ordered factorizations of permutations into
cycles”) appear to admit no simple form.
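
Although no simple closed form is available, the coefficients are easy to generate. The Python sketch below (function names ours) relies on the observation that an alignment with $k$ cycles is a permutation with $k$ cycles together with one of the $k!$ orderings of its cycles, so that $O_n = \sum_k k!\,c(n,k)$, where $c(n,k)$ counts permutations of size $n$ with $k$ cycles and satisfies the classical recurrence $c(n,k) = c(n-1,k-1) + (n-1)\,c(n-1,k)$.

```python
from math import factorial


def stirling_cycle_row(n):
    """Return [c(n, 0), ..., c(n, n)], the numbers of permutations of size n by cycle count."""
    row = [1]                                    # c(0, 0) = 1
    for size in range(1, n + 1):
        # c(size, k) = c(size - 1, k - 1) + (size - 1) * c(size - 1, k)
        row = [(row[k - 1] if k >= 1 else 0) + (size - 1) * (row[k] if k < size else 0)
               for k in range(size + 1)]
    return row


def alignment_count(n):
    """O_n = sum_k k! c(n, k): order the k cycles of a permutation in all k! ways."""
    return sum(factorial(k) * c for k, c in enumerate(stirling_cycle_row(n)))


print([alignment_count(n) for n in range(8)])
```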
[Figure II.7: diagram of the cycles of the permutation $\sigma$ below, drawn as labelled circular digraphs.]

A permutation may be viewed as a set of cycles that are labelled circular digraphs. The diagram
shows the decomposition of the permutation


$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 & 17 \\ 11 & 12 & 13 & 17 & 10 & 15 & 14 & 9 & 3 & 4 & 6 & 2 & 7 & 8 & 1 & 5 & 16 \end{pmatrix}.$


(Cycles here read clockwise and $i$ is connected to $\sigma_i$ by an edge in the graph.)


Figure II.7. The cycle decomposition of permutations.


Permutations and cycles. From elementary mathematics, it is known that a permutation
admits a unique decomposition into cycles. Let $\sigma$ be a permutation. Start
with any element, say 1, and draw a directed edge from 1 to $\sigma(1)$, then continue
connecting to $\sigma^2(1)$, $\sigma^3(1)$, and so on; a cycle containing 1 is obtained after at most $n$
steps. If one repeats the construction, taking at each stage an element not yet connected
to earlier ones, the cycle decomposition of the permutation $\sigma$ is obtained; see
Figure II.7. This argument shows that the class of sets-of-cycles (corresponding to $\mathcal{P}$
in (35)) is isomorphic to the class of permutations as defined in Example II.2, p. 98:


(36)   $\mathcal{P} \cong \mathrm{SET}(\mathrm{CYC}(\mathcal{Z})) \cong \mathrm{SEQ}(\mathcal{Z}).$


This combinatorial isomorphism is reflected by the obvious series identity


$P(z) = \exp\left(\log\frac{1}{1-z}\right) = \frac{1}{1-z}.$


The property that exp and log are inverse of one another is nothing but an analytic
reflex of the combinatorial fact that permutations uniquely decompose into cycles!

As regards combinatorial applications, what is especially fruitful is the variety of 
special results derived from the decomposition of permutations into cycles. By a use 
of restricted construction that entirely parallels Proposition II.2, p. 110, we obtain the 
following statement. 


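Proposition II.4. The class $\mathcal{P}^{(A,B)}$ of permutations with cycle lengths in $A \subseteq \mathbb{Z}_{>0}$
and with cycle number that belongs to $B \subseteq \mathbb{Z}_{\ge 0}$ has EGF


$P^{(A,B)}(z) = \beta(\alpha(z)) \quad\text{where}\quad \alpha(z) = \sum_{a \in A} \frac{z^a}{a}, \quad \beta(z) = \sum_{b \in B} \frac{z^b}{b!}.$


▷ II.12. What about alignments? With similar notations, one has for alignments

$O^{(A,B)}(z) = \beta(\alpha(z)) \quad\text{where}\quad \alpha(z) = \sum_{a \in A} \frac{z^a}{a}, \quad \beta(z) = \sum_{b \in B} z^b,$

corresponding to $\mathcal{O}^{(A,B)} = \mathrm{SEQ}_B(\mathrm{CYC}_A(\mathcal{Z}))$. ◁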
Example II.12. Stirling cycle numbers. The class $\mathcal{P}^{(r)}$ of permutations that decompose into $r$
cycles satisfies


(37)   $\mathcal{P}^{(r)} = \mathrm{SET}_r(\mathrm{CYC}(\mathcal{Z})) \quad\Longrightarrow\quad P^{(r)}(z) = \frac{1}{r!}\left(\log\frac{1}{1-z}\right)^{r}.$


The number of such permutations of size $n$ is then


(38)   $P_n^{(r)} = \frac{n!}{r!}\,[z^n]\left(\log\frac{1}{1-z}\right)^{r}.$


These numbers are fundamental quantities of combinatorial analysis. They are known as the
Stirling numbers of the first kind, or better, according to a proposal of Knuth, the Stirling cycle
numbers. Together with the Stirling partition numbers, the properties of the Stirling cycle numbers
are explored in the book by Graham, Knuth, and Patashnik [307] where they are denoted
by $\left[{n \atop r}\right]$. See Appendix A.8: Stirling numbers, p. 735. (Note that the number of alignments
formed with $r$ cycles is $r!\left[{n \atop r}\right]$.) As we shall see shortly (p. 140), Stirling numbers also surface in
the enumeration of permutations by their number of records.

It is also of interest to determine what happens regarding cycles in a random permutation of
size $n$. Clearly, when the uniform distribution is placed over all elements of $\mathcal{P}_n$, each particular
permutation has probability exactly $1/n!$. Since the probability of an event is the quotient of
the number of favorable cases over the total number of cases, the quantity


$p_{n,k} := \frac{1}{n!}\left[{n \atop k}\right]$


is the probability that a random element of $\mathcal{P}_n$ has $k$ cycles. These probabilities can be effectively
determined for moderate values of $n$ from (38) by means of a computer algebra system. Here
are for instance selected values for $n = 100$:


k          1      2      3      4      5      6      7      8      9      10
p_{n,k}    0.01   0.05   0.12   0.19   0.21   0.17   0.11   0.06   0.03   0.01


For this value $n = 100$, we expect in a vast majority of cases the number of cycles to be in the
interval [1, 10]. (The residual probability is only about 0.005.) Under this probabilistic model,
the mean is found to be about 5.18. Thus: A random permutation of size 100 has on average a
little more than 5 cycles; it rarely has more than 10 cycles.

Such procedures demonstrate a direct exploitation of symbolic methods. They do not
however tell us how the number of cycles could depend on $n$, as $n$ increases unboundedly. Such
questions are to be investigated systematically in Chapters III and IX. Here, we shall content
ourselves with a brief sketch. First, form the bivariate generating function,


$P(z, u) := \sum_{r=0}^{\infty} P^{(r)}(z)\, u^r,$

and observe that


$P(z, u) = \sum_{r=0}^{\infty} \frac{u^r}{r!}\left(\log\frac{1}{1-z}\right)^{r} = \exp\left(u \log\frac{1}{1-z}\right) = (1-z)^{-u}.$

Newton’s binomial theorem then provides


$[z^n](1-z)^{-u} = (-1)^n \binom{-u}{n}.$
In other words, a simple formula

(39)   $\sum_{k=0}^{n} \left[{n \atop k}\right] u^k = u(u+1)(u+2)\cdots(u+n-1)$

encodes precisely all the Stirling cycle numbers corresponding to a fixed value of $n$. From here,
the expected number of cycles, $\mu_n := \sum_k k\, p_{n,k}$, is easily found to be expressed in terms of
harmonic numbers (use logarithmic differentiation of (39)):


$\mu_n = H_n \equiv 1 + \frac{1}{2} + \cdots + \frac{1}{n}.$


In particular, one has $\mu_{100} = H_{100} \doteq 5.18738$. In general: The mean number of cycles in a
random permutation of size $n$ grows logarithmically with $n$, $\mu_n \sim \log n$. □
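
The computation described in this example can be carried out without a computer algebra system. The Python sketch below (function name ours) expands the product $u(u+1)\cdots(u+n-1)$ coefficient by coefficient, divides by $n!$ to obtain the distribution of the number of cycles, and checks that the mean coincides exactly with the harmonic number $H_n$.

```python
from fractions import Fraction
from math import factorial


def cycle_count_distribution(n):
    """p[k] = probability that a uniform random permutation of size n has k cycles."""
    coeffs = [1]                                   # the empty product
    for j in range(n):                             # multiply the polynomial by (u + j)
        shifted = [0] + coeffs                     # u * poly
        scaled = [j * c for c in coeffs] + [0]     # j * poly
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return [Fraction(c, factorial(n)) for c in coeffs]


dist = cycle_count_distribution(100)
mean = sum(k * p for k, p in enumerate(dist))
harmonic_100 = sum(Fraction(1, j) for j in range(1, 101))
print([round(float(p), 2) for p in dist[1:11]], float(mean), mean == harmonic_100)
```

The first printed list gives the probabilities for $k = 1, \ldots, 10$ rounded to two decimal places.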


Example II.13. Involutions and permutations without long cycles. A permutation $\sigma$ is an
involution if $\sigma^2 = \mathrm{Id}$, with Id the identity permutation. Clearly, an involution can have only
cycles of sizes 1 and 2. The class $\mathcal{I}$ of all involutions thus satisfies


(40)   $\mathcal{I} = \mathrm{SET}(\mathrm{CYC}_{1,2}(\mathcal{Z})) \quad\Longrightarrow\quad I(z) = \exp\left(z + \frac{z^2}{2}\right).$

The explicit form of the EGF lends itself to expansion,

$I_n = \sum_{k=0}^{\lfloor n/2 \rfloor} \frac{n!}{(n-2k)!\, 2^k\, k!},$


which solves the counting problem explicitly. A pairing is an involution without a fixed point.
In other words, only cycles of length 2 are allowed, so that


$\mathcal{J} = \mathrm{SET}(\mathrm{CYC}_2(\mathcal{Z})) \quad\Longrightarrow\quad J(z) = e^{z^2/2}, \qquad J_{2n} = 1 \cdot 3 \cdot 5 \cdots (2n-1).$


(The formula for $J_n$, hence that of $I_n$, can be checked by a direct reasoning.)
Generally, the EGF of permutations, all of whose cycles (in particular the largest one) have
length at most equal to $r$, satisfies

$B^{(r)}(z) = \exp\left(\sum_{j=1}^{r} \frac{z^j}{j}\right).$

The numbers $b_n^{(r)} = [z^n] B^{(r)}(z)$ satisfy the recurrence


$(n+1)\, b_{n+1}^{(r)} = (n+1)\, b_n^{(r)} - b_{n-r}^{(r)},$

by which they can be computed quickly, while they can be analysed asymptotically by means of
the saddle-point method (Chapter VIII, p. 568). This gives access to the statistics of the longest
cycle in a permutation. □
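
As a sketch of how such a recurrence is used in practice, note that differentiating $B(z) = \exp(z + z^2/2 + \cdots + z^r/r)$ gives $(1-z)B'(z) = (1 - z^r)B(z)$, whence $(n+1)b_{n+1} = (n+1)b_n - b_{n-r}$ (with $b_j = 0$ for $j < 0$). The Python code below (function names ours) iterates this relation with exact rationals and, for $r = 2$, compares the outcome with the explicit sum $I_n = \sum_k n!/((n-2k)!\,2^k\,k!)$ for involutions.

```python
from fractions import Fraction
from math import factorial


def bounded_cycle_counts(r, n_max):
    """Numbers n! * b_n of permutations of size n <= n_max with all cycles of length <= r."""
    b = [Fraction(1)]                                    # b_0 = 1
    for n in range(n_max):
        prev = b[n - r] if n >= r else Fraction(0)       # b_{n-r}, zero for a negative index
        b.append(((n + 1) * b[n] - prev) / (n + 1))
    return [int(b_n * factorial(n)) for n, b_n in enumerate(b)]


def involution_count(n):
    """Explicit sum for I_n, from the expansion of exp(z + z^2/2)."""
    return sum(factorial(n) // (factorial(n - 2 * k) * 2 ** k * factorial(k))
               for k in range(n // 2 + 1))


print(bounded_cycle_counts(2, 7))
print([involution_count(n) for n in range(8)])
```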


Example II.14. Derangements and permutations without short cycles. Classically, a derangement
is defined as a permutation without fixed points, i.e., $\sigma_i \ne i$ for all $i$. Given an integer
$r$, an $r$-derangement is a permutation all of whose cycles (in particular the shortest one) have
length larger than $r$. Let $\mathcal{D}^{(r)}$ be the class of all $r$-derangements. A specification is


(41)   $\mathcal{D}^{(r)} = \mathrm{SET}(\mathrm{CYC}_{>r}(\mathcal{Z})),$

the corresponding EGF then being


(42)   $D^{(r)}(z) = \exp\left(\sum_{j > r} \frac{z^j}{j}\right) = \frac{\exp\left(-\sum_{j=1}^{r} \frac{z^j}{j}\right)}{1-z}.$

For instance, when $r = 1$, a direct expansion yields

$\frac{D_n^{(1)}}{n!} = 1 - \frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{(-1)^n}{n!},$


a truncation of the series expansion of $\exp(-1)$ that converges rapidly to $e^{-1}$. Phrased differently,
this becomes a famous combinatorial problem with a pleasantly quaint nineteenth-century
formulation [129]: “A number $n$ of people go to the opera, leave their hats on hooks in the cloakroom
and grab them at random when leaving; the probability that nobody gets back his own hat
is asymptotic to $1/e$, which is nearly 37%.” The usual proof uses inclusion–exclusion; see
Section III. 7, p. 198 for both the classical and symbolic arguments. (It is a sign of changing times
that Motwani and Raghavan [451, p. 11] describe the problem as one of sailors that return to
their ship in a state of inebriation and choose random cabins to sleep in.)

For the generalized derangement problem, we have, for any fixed $r$ (with $H_r$ a harmonic
number, p. 117),


(43)   $\frac{D_n^{(r)}}{n!} \sim e^{-H_r},$

which is proved easily by complex asymptotic methods (Chapter IV, p. 261). □


                   Specification                               EGF                                                          coefficient
Permutations:      $\mathrm{SEQ}(\mathcal{Z})$                 $\frac{1}{1-z}$                                              $n!$                          (p. 104)
r cycles:          $\mathrm{SET}_r(\mathrm{CYC}(\mathcal{Z}))$ $\frac{1}{r!}\left(\log\frac{1}{1-z}\right)^r$               $\left[{n \atop r}\right]$    (p. 121)
involutions:       $\mathrm{SET}(\mathrm{CYC}_{1..2}(\mathcal{Z}))$   $e^{z+z^2/2}$                                         $\approx n^{n/2}$             (pp. 122, 558)
all cycles ≤ r:    $\mathrm{SET}(\mathrm{CYC}_{1..r}(\mathcal{Z}))$   $\exp\left(\frac{z}{1} + \cdots + \frac{z^r}{r}\right)$   $\approx n^{n(1-1/r)}$    (pp. 122, 568)
derangements:      $\mathrm{SET}(\mathrm{CYC}_{>1}(\mathcal{Z}))$     $\frac{e^{-z}}{1-z}$                                  $\sim n!\,e^{-1}$             (pp. 122, 261)
all cycles > r:    $\mathrm{SET}(\mathrm{CYC}_{>r}(\mathcal{Z}))$     $\frac{\exp\left(-\frac{z}{1} - \cdots - \frac{z^r}{r}\right)}{1-z}$   $\sim n!\,e^{-H_r}$   (pp. 123, 261)


Figure II.8. A summary of permutation enumerations.
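
The convergence of $D_n^{(r)}/n!$ towards $e^{-H_r}$ is easily observed numerically. Since $D(z) = \exp(\sum_{j>r} z^j/j)$ satisfies $(1-z)D'(z) = z^r D(z)$, the normalized coefficients $d_n = D_n^{(r)}/n!$ obey $(n+1)d_{n+1} = n\,d_n + d_{n-r}$ (with $d_j = 0$ for $j < 0$). The Python sketch below (function name ours) iterates this relation exactly and compares with the predicted limit.

```python
from fractions import Fraction
from math import exp, factorial


def derangement_ratios(r, n_max):
    """d[n] = D_n^{(r)} / n! for n <= n_max, from (n+1) d_{n+1} = n d_n + d_{n-r}."""
    d = [Fraction(1)]                                    # d_0 = 1 (empty permutation)
    for n in range(n_max):
        prev = d[n - r] if n >= r else Fraction(0)       # d_{n-r}, zero for a negative index
        d.append((n * d[n] + prev) / (n + 1))
    return d


classical = [int(x * factorial(n)) for n, x in enumerate(derangement_ratios(1, 6))]
print(classical)                                         # classical derangement numbers
for r in (1, 2, 3):
    harmonic_r = sum(1 / j for j in range(1, r + 1))
    print(r, float(derangement_ratios(r, 30)[30]), exp(-harmonic_r))
```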


Similar to several other structures that we have been considering previously,
permutations allow for transparent connections between structural constraints and the
forms of generating functions. The major counting results encountered in this section
are summarized in Figure II.8.




▷ II.13. Permutations such that $\sigma^f = \mathrm{Id}$. Such permutations are “roots of unity” in the
symmetric group. Their EGF is

$\exp\left(\sum_{d \mid f} \frac{z^d}{d}\right),$

where the sum extends to all divisors $d$ of $f$. ◁


▷ II.14. Parity constraints in permutations. The EGFs of permutations having only even-size
cycles ($E(z)$) or odd-size cycles ($O(z)$) are, respectively,


$E(z) = \exp\left(\frac{1}{2}\log\frac{1}{1-z^2}\right) = \frac{1}{\sqrt{1-z^2}}, \qquad O(z) = \exp\left(\frac{1}{2}\log\frac{1+z}{1-z}\right) = \sqrt{\frac{1+z}{1-z}}.$


One finds $E_{2n} = (1 \cdot 3 \cdot 5 \cdots (2n-1))^2$ and $O_{2n} = E_{2n}$, $O_{2n+1} = (2n+1) E_{2n}$.
The EGFs of permutations having an even number of cycles ($E^*(z)$) and an odd number
of cycles ($O^*(z)$) are, respectively,


$E^*(z) = \cosh\left(\log\frac{1}{1-z}\right) = \frac{1}{2}\,\frac{1}{1-z} + \frac{1}{2} - \frac{z}{2}, \qquad O^*(z) = \sinh\left(\log\frac{1}{1-z}\right) = \frac{1}{2}\,\frac{1}{1-z} - \frac{1}{2} + \frac{z}{2},$


so that parity of the number of cycles is evenly distributed among permutations of size $n$ as soon
as $n \ge 2$. The generating functions obtained in this way are analogous to the ones appearing in
the discussion of “Comtet’s square”, p. 111. ◁


▷ II.15. A hundred prisoners I. This puzzle originates with a paper of Gál and Miltersen [275,
612]. A hundred prisoners, each uniquely identified by a number between 1 and 100, have 
been sentenced to death. The director of the prison gives them a last chance. He has a cabinet 
with 100 drawers (numbered 1 to 100). In each, he’ll place at random a card with a prisoner’s 
number (all numbers different). Prisoners will be allowed to enter the room one after the other 
and open, then close again, 50 drawers of their own choosing, but will not in any way be allowed 
to communicate with one another afterwards. The goal of each prisoner is to locate the drawer 
that contains his own number. If all prisoners succeed, then they will all be spared; if at least 
one fails, they will all be executed. 

There are two mathematicians among the prisoners. The first one, a pessimist, declares
that their overall chances of success are only of the order of $1/2^{100} \doteq 8 \cdot 10^{-31}$. The second
one, a combinatorialist, claims he has a strategy for the prisoners, which has a greater than 30%
chance of success. Who is right? [Note III.10, p. 176 provides a solution, but our gentle reader
is advised to reflect on the problem for a few moments, before she jumps there.] ◁


II. 4.2. Second-level structures. Consider the three basic constructors of labelled
sequences (SEQ), sets (SET), and cycles (CYC). We can play the formal game of examining
what the various combinations produce as combinatorial objects. Restricting
attention to superpositions of two constructors (an external one applied to an internal
one) gives nine possibilities summarized by the table of Figure II.9.

ext. \ int.   $\mathrm{SEQ}_{\ge 1}$                     $\mathrm{SET}_{\ge 1}$                $\mathrm{CYC}$

SEQ           Labelled compositions ($\mathcal{L}$)      Surjections ($\mathcal{R}$)           Alignments ($\mathcal{O}$)
              SEQ ∘ SEQ                                  SEQ ∘ SET                             SEQ ∘ CYC
              $\frac{1-z}{1-2z}$                         $\frac{1}{2-e^z}$                     $\frac{1}{1-\log(1-z)^{-1}}$

SET           Fragmented permutations ($\mathcal{F}$)    Set partitions ($\mathcal{S}$)        Permutations ($\mathcal{P}$)
              SET ∘ SEQ                                  SET ∘ SET                             SET ∘ CYC
              $e^{z/(1-z)}$                              $e^{e^z-1}$                           $\frac{1}{1-z}$

CYC           Supernecklaces ($\mathcal{S}^{I}$)         Supernecklaces ($\mathcal{S}^{II}$)   Supernecklaces ($\mathcal{S}^{III}$)
              CYC ∘ SEQ                                  CYC ∘ SET                             CYC ∘ CYC
              $\log\frac{1-z}{1-2z}$                     $\log(2-e^z)^{-1}$                    $\log\frac{1}{1-\log(1-z)^{-1}}$


Figure II.9. The nine second-level structures.


The classes of surjections, alignments, set partitions, and permutations appear
naturally as SEQ ∘ SET, SEQ ∘ CYC, SET ∘ SET, and SET ∘ CYC (top right corner).
The others represent essentially non-classical objects. The case of the class $\mathcal{L} =
\mathrm{SEQ}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$ describes objects that are (ordered) sequences of linear graphs; this
can be interpreted as permutations with separators inserted, e.g., 53|264|1, or alternatively
as integer compositions with a labelling superimposed, so that $L_n = n!\,2^{n-1}$.
The class $\mathcal{F} = \mathrm{SET}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$ corresponds to unordered collections of permutations;
in other words, “fragments” are obtained by breaking a permutation into pieces
(pieces must be non-empty for definiteness). The interesting EGF is


$F(z) = e^{z/(1-z)} = 1 + z + 3\frac{z^2}{2!} + 13\frac{z^3}{3!} + 73\frac{z^4}{4!} + \cdots$


(EIS A000262: “sets of lists”). The corresponding asymptotic analysis serves to illustrate
an important aspect of the saddle-point method in Chapter VIII (p. 562). What we
termed “supernecklaces” in the last row represents cyclic arrangements of composite
objects existing in three brands.
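
Two entries of the first column of the table lend themselves to a quick numerical check. In the Python sketch below (function names ours), fragmented permutations of size $n$ are counted by the number $k$ of fragments: write the $n$ labels in one of $n!$ orders, cut the resulting line into $k$ non-empty runs in $\binom{n-1}{k-1}$ ways, and divide by $k!$ since the fragments form a set. Labelled compositions are counted by $n!\,2^{n-1}$, as each of the $n-1$ gaps of a permutation may or may not carry a separator.

```python
from math import comb, factorial


def fragmented_permutation_count(n):
    """F_n = n! [z^n] exp(z / (1 - z)), summed over the number k of fragments."""
    if n == 0:
        return 1
    return sum(factorial(n) // factorial(k) * comb(n - 1, k - 1)
               for k in range(1, n + 1))


def labelled_composition_count(n):
    """L_n = n! * 2^(n-1) for n >= 1 (and 1 for the empty structure)."""
    return 1 if n == 0 else factorial(n) * 2 ** (n - 1)


print([fragmented_permutation_count(n) for n in range(7)])
print([labelled_composition_count(n) for n in range(7)])
```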

All sorts of refinements, of which Figures II.8 and II.9 may give an idea, are 
clearly possible. We leave to the reader’s imagination the task of determining which 
among the level 3 structures may be of combinatorial interest.


▷ II.16. A meta-exercise: Counting specifications of level n. The algebra of constructions satisfies
the combinatorial isomorphism $\mathrm{SET}(\mathrm{CYC}(\mathcal{X})) \cong \mathrm{SEQ}(\mathcal{X})$ for all $\mathcal{X}$. How many different
terms involving $n$ constructions can be built from three symbols CYC, SET, SEQ satisfying a
semi-group law (“∘”) together with the relation SET ∘ CYC = SEQ? This determines the number
of specifications of level $n$. [Hint: the OGF is rational as normal forms correspond to words
with an excluded pattern.] ◁


II.5. Labelled trees, mappings, and graphs 


In this section, we consider labelled trees as well as other important structures that 
are naturally associated with them. As in the unlabelled case considered in Section I. 6, 
p. 83, the corresponding combinatorial classes are inherently recursive, since a tree is 
obtained by appending a root to a collection (set or sequence) of subtrees. From here, 
it is possible to build the “functional graphs” associated to mappings from a finite set 
to itself—these decompose as sets of connected components that are cycles of trees. 
Variations of these constructions finally open up the way to the enumeration of graphs
having a fixed excess of the number of edges over the number of vertices. 
[Figure: an unlabelled plane tree (the “shape”) paired with the label sequence (3, 2, 5, 1, 7, 4, 6).]


Figure II.10. A labelled plane tree is determined by an unlabelled tree (the “shape”)
and a permutation of the labels 1, ..., 7.


II. 5.1. Trees. The trees to be studied here are labelled, meaning that nodes bear 
distinct integer labels. Unless otherwise specified, they are rooted, meaning as usual 
that one node is distinguished as the root. Labelled trees, like their unlabelled coun- 
terparts, exist in two varieties: (i) plane trees where an embedding in the plane is 
understood (or, equivalently, subtrees dangling from a node are ordered, say, from 
left to right); (ii) non-plane trees where no such embedding is imposed (such trees
are then nothing but connected undirected acyclic graphs with a distinguished root).
Trees may be further restricted by the additional constraint that the nodes’ outdegrees
should belong to a fixed set $\Omega \subseteq \mathbb{Z}_{\ge 0}$ where $\Omega \ni 0$.


Plane labelled trees. We first dispose of the plane variety of labelled trees. Let
$\mathcal{A}$ be the set of (rooted labelled) plane trees constrained by $\Omega$. This family is

$$\mathcal{A} = \mathcal{Z} \star \mathrm{SEQ}_{\Omega}(\mathcal{A}),$$

where $\mathcal{Z}$ represents the atomic class consisting of a single labelled node: $\mathcal{Z} = \{1\}$.
The sequence construction appearing here reflects the planar embedding of trees, as
subtrees stemming from a common root are ordered between themselves. Accordingly,
the EGF $A(z)$ satisfies

$$A(z) = z\,\phi(A(z)) \quad\text{where}\quad \phi(u) = \sum_{\omega\in\Omega} u^{\omega}.$$

This is exactly the same equation as the one satisfied by the ordinary GF of
$\Omega$-restricted unlabelled plane trees (see Proposition I.5, p. 66). Thus, $\frac{1}{n!}A_n$ is the number
of unlabelled trees. In other words: in the plane rooted case, the number of labelled
trees equals n! times the corresponding number of unlabelled trees. As illustrated by
Figure II.10, this is easily understood combinatorially: each labelled tree can be defined
by its “shape”, which is an unlabelled tree, and by the sequence of node labels where
nodes are traversed in some fixed order (preorder, say). In a way similar to Proposition I.5,
p. 66, one has, by Lagrange inversion (Appendix A.6: Lagrange Inversion, p. 732):

$$A_n = n!\,[z^n]A(z) = (n-1)!\,[u^{n-1}]\,\phi(u)^n.$$
[Figure: the Cayley trees of sizes 1, 2 and 3.]


Figure II.11. There are $T_1 = 1$, $T_2 = 2$, $T_3 = 9$, and in general $T_n = n^{n-1}$ Cayley
trees of size n.


This simple analytic-combinatorial relation enables us to transpose all of the enumeration
results of Subsection I. 5.1, p. 65, to plane labelled trees, upon multiplying the
evaluations by n!, of course. In particular, the total number of “general” plane labelled
trees (with no degree restriction imposed, i.e., $\Omega = \mathbb{Z}_{\ge 0}$) is

$$n! \times \frac{1}{n}\binom{2n-2}{n-1} = \frac{(2n-2)!}{(n-1)!} = 2^{n-1}\bigl(1\cdot 3\cdots(2n-3)\bigr).$$

The corresponding sequence starts as 1, 2, 12, 120, 1680 and is EIS A001813.

Non-plane labelled trees. We next turn to non-plane labelled trees (Figure II.11)
to which the rest of this section will be devoted. The class $\mathcal{T}$ of all such trees is
definable by a symbolic equation, which provides an implicit equation satisfied by the
EGF:

$$\mathcal{T} = \mathcal{Z} \star \mathrm{SET}(\mathcal{T}) \quad\Longrightarrow\quad T(z) = z e^{T(z)}. \tag{44}$$

There the set construction translates the fact that subtrees stemming from the root are
not ordered between themselves. From the specification (44), the EGF $T(z)$ is defined
implicitly by the “functional equation”

$$T(z) = z e^{T(z)}. \tag{45}$$

The first few values are easily found, for instance by the method of indeterminate
coefficients:

$$T(z) = z + 2\frac{z^2}{2!} + 9\frac{z^3}{3!} + 64\frac{z^4}{4!} + 625\frac{z^5}{5!} + \cdots.$$

As suggested by the first few coefficients ($9 = 3^2$, $64 = 4^3$, $625 = 5^4$), the general
formula is

$$T_n = n^{n-1}, \tag{46}$$

which is established (as in the case of plane unlabelled trees) by Lagrange inversion:

$$T_n = n!\,[z^n]T(z) = n!\left(\frac{1}{n}[u^{n-1}](e^{u})^{n}\right) = n^{n-1}. \tag{47}$$
The enumeration result $T_n = n^{n-1}$ is a famous one, attributed to the prolific
British mathematician Arthur Cayley (1821-1895) who had keen interest in combinatorial
mathematics and published altogether over 900 papers and notes. Consequently,
formula (46) given by Cayley in 1889 is often referred to as “Cayley’s
formula” and unrestricted non-plane labelled trees are often called “Cayley trees”.
See [67, p. 51] for a historical discussion. The function $T(z)$ is also known as the
(Cayley) “tree function”; it is a close relative of the W-function [131] defined implicitly
by $W e^{W} = z$, which was introduced by the Swiss mathematician Johann Lambert
(1728-1777) otherwise famous for first proving the irrationality of the number $\pi$.
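
As an informal check of (46), the coefficients of the tree function can be extracted directly from the functional equation $T = z e^{T}$. The sketch below is ours and not part of the original text; the function name `cayley_tree_counts` is a hypothetical choice. It uses the relation $E' = T'E$ for $E = e^{T}$ together with $[z^n]T = [z^{n-1}]E$.

```python
from fractions import Fraction
from math import factorial

def cayley_tree_counts(n_max):
    """Return [T_1, ..., T_n_max], computed from T(z) = z * exp(T(z)).

    Illustrative sketch: t[n] = [z^n] T(z) and e[n] = [z^n] exp(T(z)).
    """
    t = [Fraction(0)] * (n_max + 1)
    e = [Fraction(0)] * (n_max + 1)
    e[0] = Fraction(1)
    for n in range(1, n_max + 1):
        # T = z * E gives [z^n] T = [z^(n-1)] E.
        t[n] = e[n - 1]
        # E' = T' E gives n * e[n] = sum_{k=1..n} k * t[k] * e[n-k].
        e[n] = sum(k * t[k] * e[n - k] for k in range(1, n + 1)) / n
    # T_n = n! [z^n] T(z); the products are integers.
    return [int(t[n] * factorial(n)) for n in range(1, n_max + 1)]

if __name__ == "__main__":
    print(cayley_tree_counts(6))
```

The values obtained in this way can be compared with $n^{n-1}$ for small n.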


A similar process gives the number of (non-plane rooted) trees where all outdegrees
of nodes are restricted to lie in a set $\Omega$. This corresponds to the specification

$$\mathcal{T}^{(\Omega)} = \mathcal{Z} \star \mathrm{SET}_{\Omega}(\mathcal{T}^{(\Omega)}) \quad\Longrightarrow\quad T^{(\Omega)}(z) = z\,\bar\phi(T^{(\Omega)}(z)), \qquad \bar\phi(u) := \sum_{\omega\in\Omega}\frac{u^{\omega}}{\omega!}.$$

What the last formula involves is the “exponential characteristic” of the degree sequence
(as opposed to the ordinary characteristic, in the planar case). It is once more
amenable to Lagrange inversion. In summary:


Proposition II.5. The number of rooted non-plane trees, where all nodes have outdegree
in $\Omega$, is

$$T^{(\Omega)}_n = (n-1)!\,[u^{n-1}]\,(\bar\phi(u))^{n} \quad\text{where}\quad \bar\phi(u) = \sum_{\omega\in\Omega}\frac{u^{\omega}}{\omega!}.$$

In particular, when all node degrees are allowed, i.e., when $\Omega \equiv \mathbb{Z}_{\ge 0}$, the number of
trees is $T_n = n^{n-1}$ and its EGF is the Cayley tree function satisfying $T(z) = z e^{T(z)}$.


As in the unlabelled case (p. 66), we refer to a class of labelled trees defined by
degree restrictions as a simple variety of trees: its EGF satisfies an equation of the
form $y = z\phi(y)$.

▷ II.17. Prüfer’s bijective proofs of Cayley’s formula. The simplicity of Cayley’s formula calls
for a combinatorial explanation. The most famous one is due to Prüfer (in 1918). It establishes
as follows a bijective correspondence between unrooted Cayley trees, whose number is $n^{n-2}$ for
size n, and sequences $(a_1, \ldots, a_{n-2})$ with $1 \le a_j \le n$ for each j. Given an unrooted tree $\tau$,
remove the endnode (and its incident edge) with the smallest label; let $a_1$ denote the label of
the node that was joined to the removed node. Continue with the pruned tree $\tau'$ to get $a_2$ in a
similar way. Repeat the construction of the sequence until the tree obtained only consists of a
single edge. For instance:

[Figure: an unrooted tree on the nodes 1, ..., 8]  ↦  (4, 8, 4, 8, 8, 4).

It can be checked that the correspondence is bijective; see [67, p. 53] or [445, p. 5]. ◁
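
The pruning procedure of Note II.17 is short enough to be transcribed directly. The following sketch is ours (the names `prufer_sequence` and `tree_from_prufer` are hypothetical); it encodes a tree given by its edge list and also implements the standard inverse reconstruction, so that the bijection can be tested on small sizes.

```python
def prufer_sequence(n, edges):
    """Encode an unrooted labelled tree on nodes 1..n, given as a list of edges."""
    adj = {v: set() for v in range(1, n + 1)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seq = []
    for _ in range(n - 2):
        # Remove the endnode with the smallest label and record its neighbour.
        leaf = min(v for v in adj if len(adj[v]) == 1)
        (neighbour,) = adj[leaf]
        seq.append(neighbour)
        adj[neighbour].remove(leaf)
        del adj[leaf]
    return seq

def tree_from_prufer(seq):
    """Rebuild the edge list of the tree on nodes 1..len(seq)+2 from its sequence."""
    n = len(seq) + 2
    # A node appears in the sequence (its degree - 1) times.
    degree = {v: 1 for v in range(1, n + 1)}
    for a in seq:
        degree[a] += 1
    edges = []
    for a in seq:
        leaf = min(v for v in degree if degree[v] == 1)
        edges.append((leaf, a))
        degree[leaf] -= 1
        degree[a] -= 1
    # Exactly two nodes of degree 1 remain; they form the last edge.
    u, v = [x for x in degree if degree[x] == 1]
    edges.append((u, v))
    return edges

if __name__ == "__main__":
    print(tree_from_prufer([4, 8, 4, 8, 8, 4]))
```

Decoding the sequence (4, 8, 4, 8, 8, 4) gives a tree in which nodes 4 and 8 are adjacent and each carries three endnodes; encoding that tree returns the same sequence.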


▷ II.18. Forests. The number of unordered k-forests (i.e., k-sets of trees) is

$$F^{(k)}_n = n!\,[z^n]\frac{T(z)^k}{k!} = \frac{(n-1)!}{(k-1)!}[u^{n-k}](e^{u})^{n} = \binom{n-1}{k-1} n^{n-k},$$

as follows from Bürmann’s form of Lagrange inversion, relative to powers (p. 66). ◁


▷ II.19. Labelled hierarchies. The class $\mathcal{L}$ of labelled hierarchies is formed of trees whose
internal nodes are unlabelled and are constrained to have outdegree larger than 1, while their
leaves have labels attached to them. As for other labelled structures, size is the number of labels
(internal nodes do not contribute). Hierarchies satisfy the specification (compare with p. 72)

$$\mathcal{L} = \mathcal{Z} + \mathrm{SET}_{\ge 2}(\mathcal{L}) \quad\Longrightarrow\quad L = z + e^{L} - 1 - L.$$

This happens to be solvable in terms of the Cayley function: $L(z) = T\bigl(\tfrac{1}{2}e^{z/2-1/2}\bigr) + \tfrac{z}{2} - \tfrac{1}{2}$.
The first few values are 0, 1, 4, 26, 236, 2752 (EIS A000311): these numbers count phylogenetic
trees, used to describe the evolution of a genetically-related group of organisms, and
they correspond to Schröder’s “fourth problem” [129, p. 224]. The asymptotic analysis is done
in Example VII.12, p. 472.


[Figure: a functional graph drawn as a set of components, each a cycle of trees.]

Figure II.12. A functional graph of size n = 26 associated to the mapping g such
that g(1) = 16, g(2) = g(3) = 11, g(4) = 23, and so on.


The class of binary (labelled) hierarchies defined by the additional fact that internal nodes 
can have degree 2 only is expressed by 


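$$\mathcal{M} = \mathcal{Z} + \mathrm{SET}_{2}(\mathcal{M}) \quad\Longrightarrow\quad M(z) = 1 - \sqrt{1-2z} \quad\text{and}\quad M_n = 1\cdot 3\cdots(2n-3),$$

where the counting numbers are now, surprisingly perhaps, the odd factorials. ◁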


II. 5.2. Mappings and functional graphs. Let $\mathcal{F}$ be the class of mappings (or
“functions”) from [1..n] to itself. A mapping $f \in [1..n] \to [1..n]$ can be represented
by a directed graph over the set of vertices [1..n] with an edge connecting x
to f(x), for all $x \in [1..n]$. The graphs so obtained are called functional graphs and
they have the characteristic property that the outdegree of each vertex is exactly equal
to 1.


Mappings and associated graphs. Given a mapping (or function) f, upon starting
from any point $x_0$, the succession of (directed) edges in the graph traverses the
vertices corresponding to iterated values of the mapping,

$$x_0,\ f(x_0),\ f(f(x_0)),\ \ldots.$$


Since the domain is finite, each such sequence must eventually loop back on itself. 
When the operation is repeated, starting each time from an element not previously hit, 
the vertices group themselves into (weakly connected) components. This leads to a 
valuable characterization of functional graphs (Figure II.12): a functional graph is a 
set of connected functional graphs; a connected functional graph is a collection of 
rooted trees arranged in a cycle. (This decomposition is seen to extend the decom- 
position of permutations into cycles, p. 120.) 
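Thus, with $\mathcal{T}$ being as before the class of all Cayley trees, and with $\mathcal{K}$ the class of
all connected functional graphs, we have the specification:

$$\begin{cases}
\mathcal{F} = \mathrm{SET}(\mathcal{K}) \\
\mathcal{K} = \mathrm{CYC}(\mathcal{T}) \\
\mathcal{T} = \mathcal{Z} \star \mathrm{SET}(\mathcal{T})
\end{cases}
\quad\Longrightarrow\quad
\begin{cases}
F(z) = e^{K(z)} \\
K(z) = \log\dfrac{1}{1-T(z)} \\
T(z) = z e^{T(z)}.
\end{cases} \tag{48}$$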


What is especially interesting here is a specification binding three types of related
structures. From (48), the EGF $F(z)$ is found to satisfy $F = (1-T)^{-1}$. It can be
checked from this, by Lagrange inversion once again (p. 733), that we have

$$F_n = n^{n}, \tag{49}$$

as was to be expected (!) from the origin of the problem. More interestingly, Lagrange
inversion also gives the number of connected functional graphs (expand $\log(1-T)^{-1}$
and recover coefficients by Bürmann’s form, p. 66):

$$K_n = n^{n-1}Q(n) \quad\text{where}\quad Q(n) := 1 + \frac{n-1}{n} + \frac{(n-1)(n-2)}{n^{2}} + \cdots. \tag{50}$$


The quantity Q(n) that appears in (50) is a famous one that surfaces in many prob- 
lems of discrete mathematics (including the birthday paradox, Equation (27), p. 115). 
Knuth has proposed naming it “Ramanujan’s Q-function” as it already appears in the 
first letter of Ramanujan to Hardy in 1913. The asymptotic analysis is elementary 
and involves developing a continuous approximation of the general term and approx- 
imating the resulting Riemann sum by an integral: this is an instance of the Laplace 
method for sums briefly explained in Appendix B.6: Laplace’s method, p. 755 (see 
also [377, Sec. 1.2.11.3] and [538, Sec. 4.7]). In fact, very precise estimates come 
out naturally from an analysis of the singularities of the EGF K (z), as we shall see in 
Chapters VI (p. 416) and VII (p. 449). The net result is 


$$K_n \sim n^{n}\sqrt{\frac{\pi}{2n}},$$

so that a fraction about $1/\sqrt{n}$ of all the graphs consist of a single component.
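
Formula (50) lends itself to a small numerical experiment. The sketch below is ours and makes no claim to efficiency; the function names are hypothetical. It evaluates $K_n = n^{n-1}Q(n)$ exactly and compares it with an exhaustive enumeration of all $n^n$ mappings of a small set, connectivity being tested by a union-find structure on the edges $x \to f(x)$.

```python
from fractions import Fraction
from itertools import product

def ramanujan_q(n):
    """Q(n) = 1 + (n-1)/n + (n-1)(n-2)/n^2 + ... as an exact fraction."""
    total, term = Fraction(0), Fraction(1)
    for k in range(1, n + 1):
        total += term
        term *= Fraction(n - k, n)
    return total

def connected_count_formula(n):
    """Number of connected functional graphs of size n, from formula (50)."""
    return int(n ** (n - 1) * ramanujan_q(n))

def is_connected(f):
    """f[x] is the image of x, for x in 0..n-1; True if the graph is connected."""
    n = len(f)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x in range(n):
        parent[find(x)] = find(f[x])
    return len({find(x) for x in range(n)}) == 1

def connected_count_brute(n):
    """Exhaustive count over all n^n mappings; only sensible for small n."""
    return sum(is_connected(f) for f in product(range(n), repeat=n))

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, connected_count_formula(n), connected_count_brute(n))
```

For instance, of the 4 mappings of a 2-element set, only the identity fails to be connected, in agreement with $K_2 = 2\cdot(1 + \tfrac{1}{2}) = 3$.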


Constrained mappings. As is customary with the symbolic method, basic con- 
structions open the way to a large number of related counting results (Figure II.13). 
First, by an adaptation of (48), the mappings without fixed points ($\forall x : f(x) \ne x$) and
those without 1- or 2-cycles (additionally, $\forall x : f(f(x)) \ne x$) have EGFs, respectively,

$$\frac{e^{-T(z)}}{1-T(z)}, \qquad \frac{e^{-T(z)-T^{2}(z)/2}}{1-T(z)}.$$

The first term is consistent with what a direct count yields, namely $(n-1)^{n}$, which is
asymptotic to $e^{-1}n^{n}$, so that the fraction of mappings without fixed point is asymptotic
to $e^{-1}$. The second one lends itself easily to complex asymptotic methods that give

$$n!\,[z^n]\frac{e^{-T-T^{2}/2}}{1-T} \sim e^{-3/2}\,n^{n},$$

and the proportion is asymptotic to $e^{-3/2}$. These two particular estimates are of
the same form as that found for permutations (the generalized derangements, Equation (43)).
Such facts are not quite obvious by elementary probabilistic arguments, but
they are neatly explained by the singular theory of combinatorial schemas developed
in Part B of this book.


                    EGF                           coefficient
  Mappings:         $\frac{1}{1-T}$               $n^{n}$                           (p. 130)
   connected        $\log\frac{1}{1-T}$           $\sim n^{n}\sqrt{\pi/(2n)}$       (pp. 130, 449)
   no fixed-point   $\frac{e^{-T}}{1-T}$          $\sim e^{-1}n^{n}$                (p. 130)
   idempotent       $e^{ze^{z}}$                  $\approx n^{n}/(\log n)^{n}$      (pp. 131, 571)
   partial          $\frac{e^{T}}{1-T}$           $\sim e\,n^{n}$                   (p. 132)


Figure II.13. A summary of various counting results relative to mappings, with T ≡
T(z) the Cayley tree function. (Bijections, surjections, involutions, and injections are
covered by previous constructions.)

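Next, idempotent mappings, i.e., ones satisfying f(f(x)) = f(x) for all x, correspond
to $\mathcal{I} = \mathrm{SET}(\mathcal{Z} \star \mathrm{SET}(\mathcal{Z}))$, so that

$$I(z) = e^{ze^{z}} \quad\text{and}\quad I_n = \sum_{k=0}^{n}\binom{n}{k}k^{n-k}.$$
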
(The specification translates the fact that idempotent mappings can have only cycles 
of length 1 on which are grafted sets of direct antecedents.) The latter sequence 
is EIS A000248, which starts as 1,1,3,10,41,196,1057. An asymptotic estimate can 
be derived either from the Laplace method or, better, from the saddle-point method 
expounded in Chapter VIII (p. 571). 
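
The closed form for $I_n$ can be confronted with a direct enumeration. The sketch below is ours (function names are hypothetical): it evaluates the sum $\sum_k \binom{n}{k}k^{n-k}$ and, independently, counts by brute force the mappings f of an n-element set that satisfy f(f(x)) = f(x).

```python
from itertools import product
from math import comb

def idempotent_count(n):
    """I_n = sum_k C(n,k) k^(n-k): choose the k fixed points, then map the rest to them."""
    return sum(comb(n, k) * k ** (n - k) for k in range(n + 1))

def idempotent_count_brute(n):
    """Exhaustive count over all n^n mappings of {0,...,n-1}; small n only."""
    return sum(
        all(f[f[x]] == f[x] for x in range(n))
        for f in product(range(n), repeat=n)
    )

if __name__ == "__main__":
    print([idempotent_count(n) for n in range(7)])
```

The first values produced by the formula agree with the sequence 1, 1, 3, 10, 41, 196, 1057 quoted above.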

Several analyses of this type are of relevance to cryptography and the study of
random number generators. For instance, the fact that a random mapping over [1..n]
tends to reach a cycle in $O(\sqrt{n})$ steps (Subsection VII. 3.3, p. 462) led Pollard to
design a surprising Monte Carlo integer factorization algorithm; see [378, p. 371]
and [538, Sec 8.8], as well as our discussion in Example VII.11, p. 465. This algorithm,
once suitably optimized, first led to the factorization of the Fermat number
$F_8 = 2^{2^{8}} + 1$ obtained by Brent in 1980.


▷ II.20. Binary mappings. The class $\mathcal{BF}$ of binary mappings, where each point has either 0
or 2 preimages, is specified by

$$\mathcal{BF} = \mathrm{SET}(\mathcal{K}), \quad \mathcal{K} = \mathrm{CYC}(\mathcal{P}), \quad \mathcal{P} = \mathcal{Z} \star \mathcal{B}, \quad \mathcal{B} = \mathcal{Z} \star \mathrm{SET}_{0,2}(\mathcal{B})$$

(planted trees $\mathcal{P}$ and binary trees $\mathcal{B}$ are needed), so that

$$BF(z) = \frac{1}{\sqrt{1-2z^{2}}}, \qquad BF_{2n} = \frac{((2n)!)^{2}}{2^{n}(n!)^{2}}.$$

The class $\mathcal{BF}$ is an approximate model of the behaviour of (modular) quadratic functions under
iteration. See [18, 247] for a general enumerative theory of random mappings including
degree-restricted ones. ◁


▷ II.21. Partial mappings. A partial mapping may be undefined at some points, and at those
we consider it takes a special value, ⊥. The iterated preimages of ⊥ form a forest, while
the remaining values organize themselves into a standard mapping. The class $\mathcal{PF}$ of partial
mappings is thus specified by $\mathcal{PF} = \mathrm{SET}(\mathcal{T}) \star \mathcal{F}$, so that

$$PF(z) = \frac{e^{T(z)}}{1-T(z)} \quad\text{and}\quad PF_n = (n+1)^{n}.$$

This construction lends itself to all sorts of variations. For instance, the class $\mathcal{PFI}$ of injective
partial maps is described as sets of chains of linear and circular graphs,
$\mathcal{PFI} = \mathrm{SET}(\mathrm{CYC}(\mathcal{Z}) + \mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$, so that

$$PFI(z) = \frac{1}{1-z}e^{z/(1-z)}, \qquad PFI_n = \sum_{i=0}^{n} i!\binom{n}{i}^{2}.$$

(This is a symbolic rewriting of part of the paper [78]; see Example VIII.13, p. 596, for
asymptotics.) ◁
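
Both counts of Note II.21 can be checked on small cases. The sketch below is ours and the helper names are hypothetical. For partial mappings it unfolds the labelled product $\mathrm{SET}(\mathcal{T}) \star \mathcal{F}$ as a binomial convolution, using the fact that $e^{T} = T/z$ so that forests of k nodes are counted by $(k+1)^{k-1}$. For injective partial maps it compares the stated sum with an exhaustive enumeration, where the extra value n stands for “undefined”.

```python
from itertools import product
from math import comb, factorial

def rooted_forests(k):
    """k! [z^k] exp(T(z)); since exp(T) = T/z, this is (k+1)^(k-1)."""
    return 1 if k == 0 else (k + 1) ** (k - 1)

def partial_mappings(n):
    """Binomial convolution for SET(T) * F: a forest on k labels, a mapping on the rest."""
    return sum(
        comb(n, k) * rooted_forests(k) * (n - k) ** (n - k)
        for k in range(n + 1)
    )

def partial_injections_formula(n):
    """PFI_n = sum_i i! C(n,i)^2."""
    return sum(factorial(i) * comb(n, i) ** 2 for i in range(n + 1))

def partial_injections_brute(n):
    """Enumerate maps {0..n-1} -> {0..n}, where the value n means 'undefined'."""
    count = 0
    for f in product(range(n + 1), repeat=n):
        defined = [v for v in f if v != n]
        count += len(defined) == len(set(defined))
    return count

if __name__ == "__main__":
    print([partial_mappings(n) for n in range(6)])
    print([partial_injections_formula(n) for n in range(6)])
```

The first function should return $(n+1)^n$, which is also what a direct count of partial mappings gives.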


II. 5.3. Labelled graphs. Random graphs form a major chapter of the theory of 
random discrete structures [76, 355]. We examine here enumerative results concerning 
graphs of low “complexity”, that is, graphs which are very nearly trees. (Such graphs 
for instance play an essential rôle in the analysis of early stages of the evolution of a
random graph, when edges are successively added, as shown in [241, 354].) 

Unrooted trees and acyclic graphs. The simplest of all connected graphs are 
certainly the ones that are acyclic. These are trees, but contrary to the case of Cayley 
trees, no root is specified. Let $\mathcal{U}$ be the class of all unrooted trees. Since a rooted tree
(rooted trees are, as we know, counted by $T_n = n^{n-1}$) is an unrooted tree combined
with a choice of a distinguished node (there are n such possible choices for trees of
size n), one has

$$T_n = nU_n \quad\text{implying}\quad U_n = n^{n-2}.$$

At generating function level, this combinatorial equality translates into

$$U(z) = \int_{0}^{z} T(w)\,\frac{dw}{w},$$

which integrates to give (take T as the independent variable)

$$U(z) = T(z) - \frac{1}{2}T(z)^{2}.$$

Since $U(z)$ is the EGF of acyclic connected graphs, the quantity

$$A(z) = e^{U(z)} = e^{T(z)-T(z)^{2}/2}$$

is the EGF of all acyclic graphs. (Equivalently, these are unordered forests of unrooted
trees; the sequence is EIS A001858: 1, 1, 2, 7, 38, 291, ...) Singularity analysis methods
(Note VI.14, p. 406) imply the estimate $A_n \sim e^{1/2}n^{n-2}$. Surprisingly, perhaps,
there are barely more acyclic graphs than unrooted trees—such phenomena are easily 
explained by singularity analysis. 
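
The relation $A(z) = e^{U(z)}$ translates into a simple recurrence on counts, obtained by classifying a forest according to the size k of the tree that contains the node labelled 1: $A_n = \sum_{k=1}^{n}\binom{n-1}{k-1}U_k A_{n-k}$. The sketch below is ours (the function names are hypothetical); it applies this recurrence with $U_k = k^{k-2}$.

```python
from math import comb

def unrooted_trees(k):
    """U_k = k^(k-2) for k >= 2, and U_1 = 1."""
    return 1 if k <= 2 else k ** (k - 2)

def acyclic_graph_counts(n_max):
    """Return [A_0, ..., A_n_max], the numbers of labelled forests of unrooted trees.

    Uses A_n = sum_{k=1..n} C(n-1, k-1) * U_k * A_{n-k}, the coefficient form
    of A(z) = exp(U(z)).
    """
    a = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        a[n] = sum(
            comb(n - 1, k - 1) * unrooted_trees(k) * a[n - k]
            for k in range(1, n + 1)
        )
    return a

if __name__ == "__main__":
    counts = acyclic_graph_counts(40)
    print(counts[:6])
    # Ratio A_n / n^(n-2), to be compared with the estimate e^(1/2).
    print(counts[40] / 40 ** 38)
```

The initial values reproduce 1, 1, 2, 7, 38, 291, and the printed ratio gives a rough idea of how quickly $A_n/n^{n-2}$ approaches $e^{1/2}$.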




Unicyclic graphs. The excess of a graph is defined as the difference between the 
number of edges and the number of vertices. For a connected graph, this quantity must 
be at least −1, this minimal value being precisely attained by unrooted trees. The class
$\mathcal{W}_k$ is the class of connected graphs of excess equal to k; in particular $\mathcal{U} = \mathcal{W}_{-1}$. The
successive classes $\mathcal{W}_{-1}, \mathcal{W}_{0}, \mathcal{W}_{1}, \ldots$ may be viewed as describing connected graphs
of increasing complexity.

The class $\mathcal{W}_0$ comprises all connected graphs with the number of edges equal to
the number of vertices. Equivalently, a graph in $\mathcal{W}_0$ is a connected graph with exactly
one cycle (a sort of “eye”), and for that reason, elements of $\mathcal{W}_0$ are sometimes referred
to as “unicyclic components” or “unicycles”. In a way, such a graph looks very
much like an undirected version of a connected functional graph. In precise terms, a
graph of $\mathcal{W}_0$ consists of a cycle of length at least 3 (by definition, graphs have neither
loops nor multiple edges) that is undirected (the orientation present in the usual cycle
construction is killed by identifying cycles isomorphic up to reflection) and on which
are grafted trees (these are implicitly rooted by the point at which they are attached
to the cycle). With UCYC representing the (new) undirected cycle construction, one
thus has

$$\mathcal{W}_0 \cong \mathrm{UCYC}_{\ge 3}(\mathcal{T}).$$


We claim that this construction is reflected by the EGF equation

$$W_0(z) = \frac{1}{2}\log\frac{1}{1-T(z)} - \frac{1}{2}T(z) - \frac{1}{4}T(z)^{2}. \tag{51}$$

Indeed one has the isomorphism

$$\mathcal{W}_0 + \mathcal{W}_0 \cong \mathrm{CYC}_{\ge 3}(\mathcal{T}),$$

since we may regard the two disjoint copies on the left as instantiating two possible
orientations of the undirected cycle. The result of (51) then follows from the usual
translation of the cycle construction; it is originally due to the Hungarian probabilist
Rényi in 1959. Asymptotically, one finds (using methods of Chapter VI, p. 406):

$$n!\,[z^n]W_0 \sim \frac{1}{4}\sqrt{2\pi}\,n^{n-1/2}. \tag{52}$$

(The sequence starts as 0, 0, 1, 15, 222, 3660, 68295 and is EIS A057500.)
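
Expanding (51) as $\frac{1}{2}\sum_{k\ge 3}T^{k}/k$ and using Bürmann’s form, $[z^n]T^{k} = \frac{k}{n}\,\frac{n^{n-k}}{(n-k)!}$, gives the explicit count $\frac{1}{2}\sum_{k=3}^{n}\frac{(n-1)!}{(n-k)!}\,n^{n-k}$. The sketch below is ours (function names are hypothetical); it evaluates this sum and compares it, for small n, with an exhaustive enumeration of connected graphs having n vertices and n edges.

```python
from itertools import combinations
from math import factorial

def unicyclic_formula(n):
    """n! [z^n] W_0(z), from the expansion (1/2) * sum_{k>=3} T^k / k."""
    total = sum(
        factorial(n - 1) // factorial(n - k) * n ** (n - k)
        for k in range(3, n + 1)
    )
    return total // 2

def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def unicyclic_brute(n):
    """Count connected simple graphs on n labelled vertices with exactly n edges."""
    pairs = list(combinations(range(n), 2))
    count = 0
    for edges in combinations(pairs, n):
        parent = list(range(n))
        components = n
        for a, b in edges:
            ra, rb = _find(parent, a), _find(parent, b)
            if ra != rb:
                parent[ra] = rb
                components -= 1
        count += components == 1
    return count

if __name__ == "__main__":
    print([unicyclic_formula(n) for n in range(1, 8)])
```

The formula reproduces the initial values 0, 0, 1, 15, 222, 3660, 68295 listed above.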


Finally, the number of graphs made only of trees and unicyclic components has
EGF

$$e^{W_{-1}(z)+W_0(z)} = \frac{e^{T/2-3T^{2}/4}}{\sqrt{1-T}},$$

which asymptotically yields $n!\,[z^n]e^{W_{-1}+W_0} \sim \Gamma(3/4)\,(2e)^{-1/4}\,\pi^{-1/2}\,n^{n-1/4}$. Such
graphs stand just next to acyclic graphs in order of structural complexity. They are the
undirected counterparts of functional graphs encountered in the previous subsection.


▷ II.22. 2-Regular graphs. This is based on Comtet’s account [129, Sec. 7.3]. A 2-regular
graph is an undirected graph in which each vertex has degree exactly 2. Connected 2-regular
graphs are thus undirected cycles of length $n \ge 3$, so that their class $\mathcal{R}$ satisfies

$$\mathcal{R} = \mathrm{SET}(\mathrm{UCYC}_{\ge 3}(\mathcal{Z})) \quad\Longrightarrow\quad R(z) = \frac{e^{-z/2-z^{2}/4}}{\sqrt{1-z}}. \tag{53}$$
                              EGF                                                        coefficient
  Graphs:                                                                                $2^{n(n-1)/2}$
   acyclic, connected         $U \equiv W_{-1} = T - T^{2}/2$                            $n^{n-2}$
   acyclic (forest)           $A = e^{T-T^{2}/2}$                                        $\sim e^{1/2}\,n^{n-2}$
   unicycle                   $W_0 = \frac{1}{2}\log\frac{1}{1-T} - \frac{T}{2} - \frac{T^{2}}{4}$   $\sim \frac{1}{4}\sqrt{2\pi}\,n^{n-1/2}$
   set of trees & unicycles   $B = \frac{e^{T/2-3T^{2}/4}}{\sqrt{1-T}}$                  $\sim \Gamma(3/4)\,\frac{(2e)^{-1/4}}{\sqrt{\pi}}\,n^{n-1/4}$
   connected, excess k        $W_k = \frac{P_k(T)}{(1-T)^{3k}}$                          $\sim \frac{P_k(1)\sqrt{2\pi}}{2^{3k/2}\,\Gamma(3k/2)}\,n^{n+(3k-1)/2}$


Figure II.14. A summary of major enumeration results relative to labelled graphs.
The asymptotic estimates result from singularity analysis (Note VI.14, p. 406).


Given n straight lines in general position in the plane, a cloud is defined to be a set of n inter- 
section points, no three being collinear. Clouds and 2-regular graphs are equinumerous. [Hint: 
Use duality.] The asymptotic analysis will serve as a prime example of the singularity analysis 
process (Examples VI.1, p. 379 and VI.2, p. 395). 

The general enumeration of r-regular graphs becomes somewhat more difficult as soon
as r > 2. Algebraic aspects are discussed in [289, 303] while Bender and Canfield [39] have
determined the asymptotic formula (for rn even)

$$R^{(r)}_n \sim \sqrt{2}\,e^{(1-r^{2})/4}\left(\frac{r^{r/2}}{e^{r/2}\,r!}\right)^{n} n^{rn/2} \tag{54}$$

for the number of r-regular graphs of size n. (See also Example VIII.9, p. 583, for regular
multigraphs.) ◁


Graphs of fixed excess. The previous discussion suggests considering more gen- 
erally the enumeration of connected graphs according to excess. E. M. Wright made 
important contributions in this area [620, 621, 622] that are revisited in the famous 
“giant paper on the giant component” by Janson, Knuth, Luczak, and Pittel [354]. 
Wright’s results are summarized by the following proposition.

Proposition II.6. The EGF $W_k(z)$ of connected graphs with excess (of edges over
vertices) equal to k is, for $k \ge 1$, of the form

$$W_k(z) = \frac{P_k(T)}{(1-T)^{3k}}, \qquad T \equiv T(z), \tag{55}$$

where $P_k$ is a polynomial of degree 3k + 2. For any fixed k, as $n \to \infty$, one has

$$W_{k,n} = n!\,[z^n]W_k(z) = \frac{P_k(1)\sqrt{2\pi}}{2^{3k/2}\,\Gamma(3k/2)}\,n^{n+(3k-1)/2}\left(1 + O(n^{-1/2})\right). \tag{56}$$

The combinatorial part of the proof (see Note II.23 below) is an interesting exercise
in graph surgery and symbolic methods. The analytic part of the statement
follows straightforwardly from singularity analysis. The polynomials $P_k(T)$ and the
constants $P_k(1)$ are determined by an explicit nonlinear recurrence; one finds for instance:

$$W_1 = \frac{1}{24}\,\frac{T^{4}(6-T)}{(1-T)^{3}}, \qquad W_2 = \frac{1}{48}\,\frac{T^{4}(2+28T-23T^{2}+9T^{3}-T^{4})}{(1-T)^{6}}.$$

▷ II.23. Wright’s surgery. The full proof of Proposition II.6 by symbolic methods requires
the notion of pointing in conjunction with multivariate generating function techniques of
Chapter III. It is convenient to define $w_k(z, y) := y^{k}W_k(zy)$, which is a bivariate generating function
with y marking the number of edges. Pick up an edge in a connected graph of excess k + 1,
then remove it. This results either in a connected graph of excess k with two pointed vertices
(and no edge in between) or in two connected components of respective excess h and k − h,
each with a pointed vertex. Graphically (with connected components in grey):

(∗) [Figure: a connected graph of excess k + 1 with a marked edge, decomposed into the two cases above.]

This translates into the differential recurrence on the $w_k$ (with $\partial_x := \frac{\partial}{\partial x}$),

$$2\,\partial_y w_{k+1} = \left(z^{2}\partial_z^{2}w_k - 2y\,\partial_y w_k\right) + \sum_{h=-1}^{k+1}(z\,\partial_z w_h)\cdot(z\,\partial_z w_{k-h}),$$

and similarly for $W_k(z) = w_k(z, 1)$. From here, it can be verified by induction that each $W_k$
is a rational function of $T \equiv T(z)$. (See Wright’s original papers [620, 621, 622] or [354] for
details; constants related to the $P_k(1)$ occur in Subsection VII. 10.1, p. 532.) ◁

As explained in the giant paper [354], such results combined with complex analytic
techniques provide, with great detail, information about a random graph $\Gamma(n, m)$
with n nodes and m edges. In the sparse case where m is of the order of n, one finds the
following properties to hold “with high probability” (w.h.p.)⁷; that is, with probability
tending to 1 as $n \to \infty$.


• For $m = \mu n$, with $\mu < \frac{1}{2}$, the random graph $\Gamma(n, m)$ has w.h.p. only tree
  and unicycle components; the largest component is w.h.p. of size O(log n).

• For $m = \frac{1}{2}n + O(n^{2/3})$, w.h.p. there appear one or several semi-giant
  components that have size $O(n^{2/3})$.

• For $m = \mu n$, with $\mu > \frac{1}{2}$, there is w.h.p. a unique giant component of size
  proportional to n.


In each case, refined estimates follow from a detailed analysis of corresponding generating
functions, which is a main theme of [241] and especially [354]. Raw forms
of these results were first obtained by Erdős and Rényi who launched the subject in a
famous series of papers dating from 1959-60; see the books [76, 355] for a probabilistic
context and the paper [40] for the finest counting estimates available. In contrast,
the enumeration of all connected graphs (irrespective of the number of edges, that is,
without excess being taken into account) is a relatively easy problem treated in the
next section. Many other classical aspects of the enumerative theory of graphs are
covered in the book Graphical Enumeration by Harary and Palmer [319].


⁷ Synonymous expressions are “asymptotically almost surely” (a.a.s.) and “in probability”. The term
“almost surely” is sometimes used, though it lends itself to confusion with properties of continuous
measures.


▷ II.24. Graphs are not specifiable. The class of all graphs does not admit a specification that
starts from single atoms and involves only sums, products, sets and cycles. Indeed, the growth
of $G_n$ is such that the EGF $G(z)$ has radius of convergence 0, whereas EGFs of constructible
classes must have a non-zero radius of convergence. (Section IV. 4, p. 249, provides a detailed
proof of this fact for iterative structures; for recursively specified classes, this is a consequence
of the analysis of inverse functions, p. 402, and systems, p. 489, with suitable adaptations based
on the technique of majorant series, p. 250.) ◁


II. 6. Additional constructions 


As in the unlabelled case, pointing and substitution are available in the world of
labelled structures (Subsection II. 6.1), and implicit definitions enlarge the scope of
the symbolic method (Subsection II. 6.2). The inversion process needed to enumerate
implicit structures is even simpler, since in the labelled universe sets and cycles
have more concise translations as operators over EGFs. Finally, and this departs
significantly from Chapter I, the fact that integer labels are naturally ordered makes it
possible to take into account certain order properties of combinatorial structures
(Subsection II. 6.3).


II. 6.1. Pointing and substitution. The pointing operation is entirely similar to 
its unlabelled counterpart since it consists in distinguishing one atom among all the 
ones that compose an object of size n. The definition of composition for labelled struc- 
tures is however a bit more subtle as it requires singling out “leaders” in components. 


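Pointing. The pointing of a class $\mathcal{B}$ is defined by

$$\mathcal{A} = \Theta\mathcal{B} \quad\text{iff}\quad \mathcal{A}_n = [1..n] \times \mathcal{B}_n.$$

In other words, in order to generate an element of $\mathcal{A}$, select one of the n labels and
point at it. Clearly

$$A_n = n\cdot B_n \quad\Longrightarrow\quad A(z) = z\frac{d}{dz}B(z).$$

Substitution (composition). Composition or substitution can be introduced so
that it corresponds a priori to composition of generating functions. It is formally
defined as

$$\mathcal{B}\circ\mathcal{C} = \sum_{k=0}^{\infty}\mathcal{B}_k \times \mathrm{SET}_k(\mathcal{C}),$$

so that its EGF is

$$\sum_{k=0}^{\infty}B_k\frac{(C(z))^{k}}{k!} = B(C(z)).$$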


A combinatorial way of realizing this definition and forming an arbitrary object of
$\mathcal{B}\circ\mathcal{C}$ is as follows. First select an element $\beta \in \mathcal{B}$ called the “base” and let $k = |\beta|$
be its size; then pick up a k-set of elements of $\mathcal{C}$; the elements of the k-set are naturally
ordered by the value of their “leader” (the leader of an object being by convention the
value of its smallest label); the element with leader of rank r is then substituted to the
node labelled by value r of $\beta$. Gathering the above, we obtain:
Theorem II.3. The combinatorial constructions of pointing and substitution are admissible:

$$\mathcal{A} = \Theta\mathcal{B} \quad\Longrightarrow\quad A(z) = z\,\partial_z B(z), \qquad \partial_z := \frac{d}{dz},$$
$$\mathcal{A} = \mathcal{B}\circ\mathcal{C} \quad\Longrightarrow\quad A(z) = B(C(z)).$$

For instance, the EGF of (relabelled) pairings of elements drawn from $\mathcal{C}$ is

$$e^{C(z)^{2}/2},$$

since the EGF of involutions without fixed points is $e^{z^{2}/2}$.


▷ II.25. Standard constructions based on substitutions. The sequence class of $\mathcal{A}$ may be
defined by composition as $\mathcal{P}\circ\mathcal{A}$ where $\mathcal{P}$ is the set of all permutations. The set class of $\mathcal{A}$ may be
defined as $\mathcal{U}\circ\mathcal{A}$ where $\mathcal{U}$ is the class of all urns. Similarly, cycles are obtained by substitution
into circular graphs. Thus,

$$\mathrm{SEQ}(\mathcal{A}) \cong \mathcal{P}\circ\mathcal{A}, \qquad \mathrm{SET}(\mathcal{A}) \cong \mathcal{U}\circ\mathcal{A}, \qquad \mathrm{CYC}(\mathcal{A}) \cong \mathcal{C}\circ\mathcal{A}.$$

In this way, permutations, urns and circular graphs appear as archetypal classes in a development
of combinatorial analysis based on composition. (Joyal’s “theory of species” [359] and the
book by Bergeron, Labelle, and Leroux [50] show that a far-reaching theory of combinatorial
enumeration can be based on the concept of substitution.) ◁


▷ II.26. Distinct component sizes. The EGFs of permutations with cycles of distinct lengths
and of set partitions with parts of distinct sizes are

$$\prod_{n=1}^{\infty}\left(1 + \frac{z^{n}}{n}\right), \qquad \prod_{n=1}^{\infty}\left(1 + \frac{z^{n}}{n!}\right).$$

The probability that a permutation of $\mathcal{P}_n$ has distinct cycle sizes tends to $e^{-\gamma}$; see [309,
Sec. 4.1.6] for a Tauberian argument and [495] for precise asymptotics. The corresponding
analysis for set partitions is treated in the seven-author paper [368]. ◁


II. 6.2. Implicit structures. Let $\mathcal{X}$ be a labelled class implicitly characterized
by either of the combinatorial equations

$$\mathcal{A} = \mathcal{B} + \mathcal{X}, \qquad \mathcal{A} = \mathcal{B} \star \mathcal{X}.$$

Then, solving the corresponding EGF equations leads to

$$X(z) = A(z) - B(z), \qquad X(z) = \frac{A(z)}{B(z)},$$

respectively. For the composite labelled constructions SEQ, SET, CYC, the algebra is
equally easy.

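Theorem II.4 (Implicit specifications). The generating functions associated with the
implicit equations in $\mathcal{X}$

$$\mathcal{A} = \mathrm{SEQ}(\mathcal{X}), \qquad \mathcal{A} = \mathrm{SET}(\mathcal{X}), \qquad \mathcal{A} = \mathrm{CYC}(\mathcal{X}),$$

are, respectively,

$$X(z) = 1 - \frac{1}{A(z)}, \qquad X(z) = \log A(z), \qquad X(z) = 1 - e^{-A(z)}.$$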
Example II.15. Connected graphs. In the context of graphical enumerations, the labelled
set construction takes the form of an enumerative formula relating a class $\mathcal{G}$ of graphs and the
subclass $\mathcal{K} \subset \mathcal{G}$ of its connected graphs:

$$\mathcal{G} = \mathrm{SET}(\mathcal{K}) \quad\Longrightarrow\quad G(z) = e^{K(z)}.$$

This basic formula is known in graph theory [319] as the exponential formula.
Consider the class $\mathcal{G}$ of all (undirected) labelled graphs, the size of a graph being the
number of its nodes. Since a graph is determined by the choice of its set of edges, there are $\binom{n}{2}$
potential edges each of which may be taken in or out, so that $G_n = 2^{\binom{n}{2}}$. Let $\mathcal{K} \subset \mathcal{G}$ be the
subclass of all connected graphs. The exponential formula determines $K(z)$ implicitly:

$$K(z) = \log\left(1 + \sum_{n\ge 1}2^{\binom{n}{2}}\frac{z^{n}}{n!}\right) = z + \frac{z^{2}}{2!} + 4\frac{z^{3}}{3!} + 38\frac{z^{4}}{4!} + 728\frac{z^{5}}{5!} + \cdots, \tag{57}$$

where the sequence is EIS A001187. The series is divergent, that is, it has radius of convergence 0.
It can nonetheless be manipulated as a formal series (Appendix A.5: Formal power
series, p. 730). Expanding by means of $\log(1+u) = u - u^{2}/2 + \cdots$ yields a complicated
convolution expression for $K_n$:

$$K_n = 2^{\binom{n}{2}} - \frac{1}{2}\sum\binom{n}{n_1, n_2}2^{\binom{n_1}{2}+\binom{n_2}{2}} + \frac{1}{3}\sum\binom{n}{n_1, n_2, n_3}2^{\binom{n_1}{2}+\binom{n_2}{2}+\binom{n_3}{2}} - \cdots.$$

(The kth term is a sum over $n_1 + \cdots + n_k = n$, with $0 < n_j < n$.) Given the very fast increase
of $G_n$ with n, for instance

$$2^{\binom{n+1}{2}} = 2^{n}\,2^{\binom{n}{2}},$$

a detailed analysis of the various terms of the expression of $K_n$ shows predominance of the first
sum, and, in that sum itself, the extreme terms corresponding to $n_1 = n-1$ or $n_2 = n-1$
predominate, so that

$$K_n = 2^{\binom{n}{2}}\left(1 - 2n\,2^{-n} + o(2^{-n})\right). \tag{58}$$

Thus: almost all labelled graphs of size n are connected. In addition, the error term decreases
very quickly: for instance, for n = 18, an exact computation based on the generating function
formula reveals that a proportion only 0.0001373291074 of all the graphs are not connected;
this is extremely close to the value 0.0001373291016 predicted by the main terms in the
asymptotic formula (58). Notice that good use could be made here of a purely divergent generating
function for asymptotic enumeration purposes.
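
The exact computation alluded to for n = 18 can be organized without manipulating the logarithm explicitly. Classifying a graph by the size j of the component that contains node 1 gives $G_n = \sum_{j=1}^{n}\binom{n-1}{j-1}K_j\,G_{n-j}$, which is the coefficient form of $G = e^{K}$ and can be solved for $K_n$. The sketch below is ours; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def connected_graph_counts(n_max):
    """Return [K_0, ..., K_n_max] with K_0 = 0, K_n = number of connected labelled graphs."""
    g = [2 ** comb(n, 2) for n in range(n_max + 1)]
    k = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        # Subtract graphs whose component containing node 1 has size j < n.
        k[n] = g[n] - sum(
            comb(n - 1, j - 1) * k[j] * g[n - j] for j in range(1, n)
        )
    return k

def disconnected_fraction(n):
    """Exact proportion of labelled graphs of size n that are not connected."""
    k_n = connected_graph_counts(n)[n]
    return 1 - Fraction(k_n, 2 ** comb(n, 2))

if __name__ == "__main__":
    print(connected_graph_counts(5)[1:])
    print(float(disconnected_fraction(18)), 2 * 18 / 2 ** 18)
```

The second printed line places the exact proportion for n = 18 next to the main term $2n\,2^{-n}$ of (58), so that the two values discussed in the example can be compared directly.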


▷ II.27. Bipartite graphs. A plane bipartite graph is a pair $(G, \omega)$ where G is a labelled graph,
$\omega = (\omega_W, \omega_E)$ is a bipartition of the nodes (into West and East categories), and the edges are
such that they only connect nodes from $\omega_W$ to nodes of $\omega_E$. A direct count shows that the EGF
of plane bipartite graphs is

$$\Gamma(z) = \sum_{n}\gamma_n\frac{z^{n}}{n!} \quad\text{with}\quad \gamma_n = \sum_{k}\binom{n}{k}2^{k(n-k)}.$$

The EGF of plane bipartite graphs that are connected is $\log\Gamma(z)$.
A bipartite graph is a labelled graph whose nodes can be partitioned into two groups so
that edges only connect nodes of different groups. The EGF of bipartite graphs is

$$\exp\left(\frac{1}{2}\log\Gamma(z)\right) = \sqrt{\Gamma(z)}.$$

[Hint. The EGF of a connected bipartite graph is $\frac{1}{2}\log\Gamma(z)$, since a factor of $\frac{1}{2}$ kills the
East-West orientation present in a connected plane bipartite graph. See Wilf’s book [608, p. 78] for
details.] ◁


▷ II.28. Do two permutations generate the symmetric group? To two permutations $\sigma, \tau$ of
the same size, associate a graph $\Gamma_{\sigma,\tau}$ whose set of vertices is $V = [1..n]$, if
$n = |\sigma| = |\tau|$, and whose set of edges is formed of all the pairs $(x, \sigma(x))$,
$(x, \tau(x))$, for $x \in V$. The probability that a random $\Gamma_{\sigma,\tau}$ is connected is

$$
\pi_n = \frac{1}{n!}\,[z^n]\,\log\Bigl(\sum_{n\ge 0} n!\,z^n\Bigr).
$$

This represents the probability that two permutations generate a transitive group (that is, for
all $x, y \in [1..n]$, there exists a composition of $\sigma, \sigma^{-1}, \tau, \tau^{-1}$ that
maps $x$ to $y$). One has

$$
\pi_n \sim 1 - \frac{1}{n} - \frac{1}{n^2} - \frac{4}{n^3} - \frac{23}{n^4}
          - \frac{171}{n^5} - \frac{1542}{n^6} - \cdots.
\tag{59}
$$

Surprisingly, the coefficients 1, 1, 4, 23, ... (EIS A084357) in the asymptotic formula (59)
enumerate a "third-level" structure (Subsection II.4.2, p. 124 and Note VIII.15, p. 571), namely:
$\mathrm{SET}(\mathrm{SET}_{\ge 1}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z})))$. In addition, one has
$n!^2\,\pi_n = (n-1)!\,I_{n+1}$, where $I_{n+1}$ is the number of indecomposable permutations of
size $n+1$ (Example I.19, p. 89).

Let $\pi_n^{\star}$ be the probability that two random permutations generate the whole symmetric
group. Then, by a result of Babai based on the classification of groups, the quantity
$\pi_n - \pi_n^{\star}$ is exponentially small, so that (59) also applies to $\pi_n^{\star}$;
see Dixon [167]. ◁
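The probabilities $\pi_n$ of this note are easy to tabulate exactly. The following Python sketch
is an illustration only: it extracts the coefficients of $\log\bigl(\sum_m m!\,z^m\bigr)$ through
the relation $F' = C'F$ for $C = \log F$, and checks the link with indecomposable permutations by
brute force for small sizes. The function names are hypothetical, and indecomposable permutations
are taken here to be those with no proper prefix that is a permutation of an initial segment
$[1..k]$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def transitive_probabilities(n_max):
    """pi[n] = (1/n!) [z^n] log(sum_m m! z^m), as exact fractions (pi[0] is unused)."""
    f = [factorial(m) for m in range(n_max + 1)]
    c = [Fraction(0)] * (n_max + 1)               # coefficients of the logarithm
    for n in range(1, n_max + 1):
        # F' = C' F  gives  n f_n = sum_{k=1..n} k c_k f_{n-k}
        c[n] = f[n] - Fraction(sum(k * c[k] * f[n - k] for k in range(1, n)), n)
    return [None] + [c[n] / factorial(n) for n in range(1, n_max + 1)]

def indecomposable_count(n):
    """Brute-force count of permutations of [1..n] with no proper prefix permuting [1..k]."""
    return sum(
        all(max(p[:k]) > k for k in range(1, n))
        for p in permutations(range(1, n + 1))
    )

pi = transitive_probabilities(6)
print(pi[2], pi[3])                               # 3/4 13/18
for n in range(1, 6):
    assert factorial(n) ** 2 * pi[n] == factorial(n - 1) * indecomposable_count(n + 1)
```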


II.6.3. Order constraints. A construction well-suited to dealing with many of
the order properties of combinatorial structures is the modified labelled product:

$$
\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C}).
$$

This denotes the subset of the product $\mathcal{B} \star \mathcal{C}$ formed with elements such
that the smallest label is constrained to lie in the $\mathcal{B}$ component. (To make this
definition consistent, it must be assumed that $B_0 = 0$.) We call this binary operation on
structures the boxed product.


Theorem II.5. The boxed product is admissible:

$$
\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C})
\quad\Longrightarrow\quad
A(z) = \int_0^z \bigl(\partial_t B(t)\bigr)\cdot C(t)\,dt,
\qquad \partial_t \equiv \frac{d}{dt}.
\tag{60}
$$


Proof. The definition of boxed products implies the coefficient relation

$$
A_n = \sum_{k=1}^{n} \binom{n-1}{k-1}\, B_k\, C_{n-k}.
$$

The binomial coefficient that appears in the standard convolution, Equation (2), p. 100,
is to be modified since only $n-1$ labels need to be distributed between the two components:
$k-1$ go to the $\mathcal{B}$ component (that is already constrained to contain the label 1)
and $n-k$ to the $\mathcal{C}$ component. From the equivalent form

$$
A_n = \frac{1}{n}\sum_{k=0}^{n} \binom{n}{k}\,(k\,B_k)\, C_{n-k},
$$

the result follows by taking EGFs, via $A'(z) = (\partial_z B(z)) \cdot C(z)$. ∎

Figure II.15. A numerical sequence of size 100 with records marked by circles:
there are 7 records that occur at times 1, 3, 5, 11, 60, 86, 88.


A useful special case is the min-rooting operation,

$$
\mathcal{A} = (\mathcal{Z}^{\square} \star \mathcal{C}),
$$

for which a variant definition goes as follows: take in all possible ways elements
$\gamma \in \mathcal{C}$, prepend an atom with a label, for instance 0, smaller than the labels of
$\gamma$, and relabel in the canonical way over $[1..(n+1)]$ by shifting all label values by 1.
Clearly $A_{n+1} = C_n$, which yields

$$
A(z) = \int_0^z C(t)\,dt,
$$

a result which is also consistent with the general formula (60) of boxed products.
For some applications, it is convenient to impose constraints on the maximal label
rather than the minimum. The max-boxed product, written

$$
\mathcal{A} = (\mathcal{B}^{\blacksquare} \star \mathcal{C}),
$$

is then defined by the fact that the maximum is constrained to lie in the $\mathcal{B}$ component
of the labelled product. Naturally, translation by an integral in (60) remains valid for this
trivially modified boxed product.
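As a sanity check on Theorem II.5, the coefficient relation of the proof and the integral formula
(60) can be compared numerically. The Python sketch below is an illustration under stated
assumptions: classes are represented only by their counting sequences, the truncation order `N`
is arbitrary, and the function names are hypothetical. The test case takes $\mathcal{B}$ to be
cyclic permutations ($B_k = (k-1)!$) and $\mathcal{C}$ all permutations, so that the boxed product
describes a permutation as the cycle containing 1 together with a permutation of the rest.

```python
from fractions import Fraction
from math import comb, factorial

def boxed_product_counts(B, C):
    """A_n = sum_k C(n-1, k-1) B_k C_{n-k}: the smallest label lies in the B component."""
    n_max = len(B) - 1
    return [sum(comb(n - 1, k - 1) * B[k] * C[n - k] for k in range(1, n + 1))
            for n in range(n_max + 1)]

def boxed_product_via_egf(B, C):
    """Same counts, read off from A(z) = int_0^z B'(t) C(t) dt with exact coefficients."""
    n_max = len(B) - 1
    b = [Fraction(x, factorial(i)) for i, x in enumerate(B)]
    c = [Fraction(x, factorial(i)) for i, x in enumerate(C)]
    db = [(i + 1) * b[i + 1] for i in range(n_max)]               # coefficients of B'(t)
    prod = [sum(db[i] * c[m - i] for i in range(m + 1)) for m in range(n_max)]
    a = [Fraction(0)] + [prod[m] / (m + 1) for m in range(n_max)]  # termwise integration
    return [int(a[n] * factorial(n)) for n in range(n_max + 1)]

N = 8
cycles = [0] + [factorial(k - 1) for k in range(1, N + 1)]
perms = [factorial(n) for n in range(N + 1)]
atom = [0, 1] + [0] * (N - 1)

print(boxed_product_counts(cycles, perms))
print(boxed_product_via_egf(cycles, perms))
print(boxed_product_counts(atom, perms))      # min-rooting: A_{n+1} = C_n
```

Both routes give $A_n = n!$ for $n \ge 1$ in the first case, and min-rooting of permutations gives
$(n-1)!$, in agreement with $A_{n+1} = C_n$.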


▷ II.29. Combinatorics of integration. In the perspective of this book, integration by parts has
an immediate interpretation. Indeed, the equality

$$
\int_0^z A'(t)\cdot B(t)\,dt + \int_0^z A(t)\cdot B'(t)\,dt = A(z)\cdot B(z)
$$

reads: "The smallest label in an ordered pair appears either on the left or on the right." ◁


Example II.16. Records in permutations. Given a sequence of numbers $x = (x_1, \ldots, x_n)$,
assumed all distinct, a record is defined to be an element $x_j$ such that $x_k < x_j$ for all
$k < j$. (A record is an element "better" than its predecessors!) Figure II.15 displays a
numerical sequence of length $n = 100$ that has 7 records. Confronted by such data, a statistician
will typically want to determine whether the data obey purely random fluctuations or if there
could be some indications of a "trend" or of a "bias" [139, Ch. 10]. (Think of the data as
reflecting share prices or athletic records, say.) In particular, if the $x_j$ are independently
drawn from a continuous distribution, then the number of records obeys the same laws as in a
random permutation of $[1..n]$. This statistical preamble then invites the question: How many
permutations of size $n$ have $k$ records?

First, we start with a special brand of permutations, the ones that have their maximum at
the beginning. Such permutations are defined as ("$\blacksquare$" indicates the boxed product
based on the maximum label)

$$
\mathcal{Q} = (\mathcal{Z}^{\blacksquare} \star \mathcal{P}),
$$

where $\mathcal{P}$ is the class of all permutations. Observe that this gives the EGF

$$
Q(z) = \int_0^z \Bigl(\frac{d}{dt}\,t\Bigr)\cdot\frac{1}{1-t}\,dt = \log\frac{1}{1-z},
$$

implying the obvious result $Q_n = (n-1)!$ for all $n \ge 1$. These are exactly the permutations
with one record. Next, consider the class

$$
\mathcal{P}^{(k)} = \mathrm{SET}_k(\mathcal{Q}).
$$

The elements of $\mathcal{P}^{(k)}$ are unordered sets of cardinality $k$ with elements of type
$\mathcal{Q}$. Define the max-leader ("el lider maximo") of any component of $\mathcal{P}^{(k)}$
as the value of its maximal element. Then, if we place the components in sequence, ordered by
increasing values of their leaders, and read off the whole sequence, we obtain a permutation with
exactly $k$ records. The correspondence$^8$ is clearly revertible. Here is an illustration, with
leaders underlined:

$$
\{(\underline{7},2,6,1),\ (\underline{4},3),\ (\underline{9},8,5)\}
\;\cong\; [(\underline{4},3),\ (\underline{7},2,6,1),\ (\underline{9},8,5)]
\;\cong\; 4,3,7,2,6,1,9,8,5.
$$


Thus, the number of permutations with $k$ records is determined by

$$
P^{(k)}(z) = \frac{1}{k!}\Bigl(\log\frac{1}{1-z}\Bigr)^{k}
\qquad\text{and}\qquad
P^{(k)}_n = \left[{n \atop k}\right],
$$

where we recognize Stirling cycle numbers from Example II.12, p. 121. In other words:


The number of permutations of size $n$ having $k$ records is counted by the
Stirling "cycle" number $\left[{n \atop k}\right]$.


Returning to our statistical problem, the treatment of Example II.12, p. 121 (to be revisited
in Chapter III, p. 189) shows that the expected number of records in a random permutation of
size $n$ equals $H_n$, the harmonic number. One has $H_{100} \doteq 5.18$, so that for 100 data
items, a little more than 5 records are expected on average. The probability of observing 7
records or more is still about 23%, an altogether not especially rare event. In contrast,
observing twice as many records as we did, namely 14, would be a fairly strong indication of a
bias; on random data, the event has probability very close to $10^{-4}$. Altogether, the present
discussion is consistent with the hypothesis for the data of Figure II.15 to have been generated
independently at random (and indeed they were).
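The record statistics of this example can be recomputed exactly. The Python sketch below is an
illustration, not the authors' computation: it assumes the classical fact that the Stirling cycle
numbers of row $n$ are the coefficients of $u(u+1)\cdots(u+n-1)$, checks the record count against
brute force for a small size, and replays the leader-sorting step on the illustration given
above. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def stirling_cycle_row(n):
    """Coefficients of u(u+1)...(u+n-1); entry k is the Stirling cycle number [n, k]."""
    row = [1]
    for j in range(n):                       # multiply the polynomial by (u + j)
        new = [0] * (len(row) + 1)
        for k, c in enumerate(row):
            new[k] += j * c
            new[k + 1] += c
        row = new
    return row

def count_records(seq):
    """Number of elements larger than all of their predecessors."""
    best, records = None, 0
    for x in seq:
        if best is None or x > best:
            best, records = x, records + 1
    return records

def foata(components):
    """Sort max-first components by increasing leader, then read off the word."""
    return [x for comp in sorted(components, key=lambda c: c[0]) for x in comp]

def brute_force_distribution(n):
    """tally[k] = number of permutations of size n with exactly k records."""
    tally = [0] * (n + 1)
    for p in permutations(range(1, n + 1)):
        tally[count_records(p)] += 1
    return tally

def records_tail(n, k):
    """Probability that a uniform random permutation of size n has at least k records."""
    return Fraction(sum(stirling_cycle_row(n)[k:]), factorial(n))

print(foata([(7, 2, 6, 1), (4, 3), (9, 8, 5)]))   # [4, 3, 7, 2, 6, 1, 9, 8, 5]
print(brute_force_distribution(5) == stirling_cycle_row(5))
print(float(records_tail(100, 7)), float(records_tail(100, 14)))
```

The last line prints the two tail probabilities discussed in the text for 100 data items.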


$^8$ This correspondence can also be viewed as a transformation on permutations that maps the
number of records to the number of cycles; it is known as Foata's fundamental correspondence
[413, Sec. 10.2].

It is possible to base a fair part of the theory of labelled constructions on sums and
products in conjunction with the boxed product. In effect, consider the three relations

$$
\begin{array}{llll}
\mathcal{F} = \mathrm{SEQ}(\mathcal{G}) & \Longrightarrow & f(z) = \dfrac{1}{1-g(z)}, & f = 1 + g f \\[2ex]
\mathcal{F} = \mathrm{SET}(\mathcal{G}) & \Longrightarrow & f(z) = e^{g(z)}, & f = 1 + \displaystyle\int g' f \\[2ex]
\mathcal{F} = \mathrm{CYC}(\mathcal{G}) & \Longrightarrow & f(z) = \log\dfrac{1}{1-g(z)}, & f = \displaystyle\int g'\,\dfrac{1}{1-g}.
\end{array}
$$


The last column is easily checked, by standard calculus, to provide an alternative form 
of the standard operator corresponding to sequences, sets, and cycles. Each case can 
in fact be deduced directly from Theorem II.5 and the labelled product rule as follows. 


(i) Sequences: they obey the recursive definition

$$
\mathcal{F} = \mathrm{SEQ}(\mathcal{G}) \quad\Longrightarrow\quad
\mathcal{F} \cong \{\epsilon\} + (\mathcal{G} \star \mathcal{F}).
$$

(ii) Sets: we have

$$
\mathcal{F} = \mathrm{SET}(\mathcal{G}) \quad\Longrightarrow\quad
\mathcal{F} \cong \{\epsilon\} + (\mathcal{G}^{\blacksquare} \star \mathcal{F}),
$$

which means that, in a set, one can always single out the component with
the largest label, the rest of the components forming a set. In other words,
when this construction is repeated, the elements of a set can be canonically
arranged according to increasing values of their largest labels, the "leaders".
(We recognize here a generalization of the construction used for records in
permutations.)

(iii) Cycles: The element of a cycle that contains the largest label can be taken
canonically as the cycle "starter", which is then followed by an arbitrary
sequence of elements upon traversing the cycle in cyclic order. Thus

$$
\mathcal{F} = \mathrm{CYC}(\mathcal{G}) \quad\Longrightarrow\quad
\mathcal{F} \cong (\mathcal{G}^{\blacksquare} \star \mathrm{SEQ}(\mathcal{G})).
$$
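The passage from the three recursive descriptions (i)-(iii) to the usual closed forms can be
spelled out as follows. This is a short worked sketch, assuming $g(0) = 0$ and using the product
rule together with the integral translation (60) of the (max-)boxed product.

```latex
\begin{align*}
\text{(i)}\quad   & f = 1 + g\,f
                  && \Longrightarrow\quad f = \frac{1}{1-g}, \\
\text{(ii)}\quad  & f = 1 + \int_0^z g'(t)\,f(t)\,dt
                  && \Longrightarrow\quad f' = g' f,\ \ f(0) = 1
                  \quad\Longrightarrow\quad f = e^{g}, \\
\text{(iii)}\quad & f = \int_0^z \frac{g'(t)}{1-g(t)}\,dt
                  && \Longrightarrow\quad f = \log\frac{1}{1-g}.
\end{align*}
```

In case (ii), the differential equation is solved by separation of variables, $f'/f = g'$; in
case (iii), the integrand is the derivative of $\log\frac{1}{1-g}$, which vanishes at $z = 0$.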


Greene [308] has developed a complete framework of labelled grammars based 
on standard and boxed labelled products. In its basic form, its expressive power is 
essentially equivalent to ours, because of the above relations. More complicated order 
constraints, dealing simultaneously with a collection of larger and smaller elements, 
can be furthermore taken into account within this framework. 


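▷ II.30. Higher order constraints, after Greene. Let the symbols $\square$, $\boxdot$,
$\blacksquare$ represent smallest, second smallest, and largest labels, respectively. One has the
correspondences (with $\partial_z = \frac{d}{dz}$)

$$
\begin{array}{lll}
\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C}^{\blacksquare})
  & \quad & \partial_z^2 A(z) = (\partial_z B(z)) \cdot (\partial_z C(z)) \\[1ex]
\mathcal{A} = (\mathcal{B}^{\square\,\blacksquare} \star \mathcal{C})
  & \quad & \partial_z^2 A(z) = (\partial_z^2 B(z)) \cdot C(z) \\[1ex]
\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C}^{\boxdot} \star \mathcal{D}^{\blacksquare})
  & \quad & \partial_z^3 A(z) = (\partial_z B(z)) \cdot (\partial_z C(z)) \cdot (\partial_z D(z)),
\end{array}
$$

and so on. These can be transformed into (iterated) integral representations. (See [308] for
more.) ◁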


The next three examples demonstrate the utility of min/max-rooting used in conjunction with
recursion. Examples II.17 and II.18 introduce two important classes of trees that are tightly
linked to permutations. Example II.19 provides a simple symbolic solution to a famous parking
problem, on which many analyses can be built.

Figure II.16. A permutation of size 7 and its increasing binary tree lifting.


Example II.17. Increasing binary trees and alternating permutations. To each permutation,
one can associate bijectively a binary tree of a special type called an increasing binary tree
and sometimes a heap-ordered tree or a tournament tree. This is a plane rooted binary tree in
which internal nodes bear labels in the usual way, but with the additional constraint that node
labels increase along any branch stemming from the root. Such trees are closely related to many
classical data structures of computer science, such as heaps and binomial queues.

The correspondence (Figure II.16) is as follows: Given a permutation written as a word,
$\sigma = \sigma_1\sigma_2\cdots\sigma_n$, factor it into the form
$\sigma = \sigma_L \cdot \min(\sigma) \cdot \sigma_R$, with $\min(\sigma)$ the smallest label
value in the permutation, and $\sigma_L$, $\sigma_R$ the factors left and right of $\min(\sigma)$.
Then the binary tree $\beta(\sigma)$ is defined recursively in the format
$\langle\text{root}, \text{left}, \text{right}\rangle$ by

$$
\beta(\sigma) = \langle \min(\sigma),\ \beta(\sigma_L),\ \beta(\sigma_R) \rangle,
\qquad \beta(\epsilon) = \epsilon.
$$

The empty tree (consisting of a unique external node of size 0) goes with the empty permutation
$\epsilon$. Conversely, reading the labels of the tree in symmetric (infix) order gives back the
original permutation. (The correspondence is described for instance by Stanley [552, pp. 23-25],
who says that "it has been primarily developed by the French", pointing at [267].)
Thus, the family $\mathcal{I}$ of binary increasing trees satisfies the recursive definition

$$
\mathcal{I} = \{\epsilon\} + (\mathcal{Z}^{\square} \star \mathcal{I} \star \mathcal{I}),
\tag{61}
$$

which implies the nonlinear integral equation for the EGF

$$
I(z) = 1 + \int_0^z I(t)^2\,dt.
$$

This equation reduces to $I'(z) = I(z)^2$ and, under the initial condition $I(0) = 1$, it admits
the solution $I(z) = (1-z)^{-1}$. Thus $I_n = n!$, which is consistent with the fact that there
are as many increasing binary trees as there are permutations.
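The recursive correspondence between permutations and increasing binary trees translates almost
word for word into code. The Python sketch below is an illustration: trees are represented as
nested tuples `(root, left, right)` with `None` for the empty tree, a representation chosen here
for convenience, and the function names are hypothetical.

```python
from itertools import permutations

def to_tree(word):
    """Increasing binary tree of a word of distinct labels: (root, left, right), None if empty."""
    if not word:
        return None
    i = word.index(min(word))                 # split at the smallest label
    return (word[i], to_tree(word[:i]), to_tree(word[i + 1:]))

def infix(tree):
    """Read the labels back in symmetric (infix) order."""
    if tree is None:
        return []
    root, left, right = tree
    return infix(left) + [root] + infix(right)

def is_increasing(tree, bound=0):
    """Check that labels increase along every branch stemming from the root."""
    if tree is None:
        return True
    root, left, right = tree
    return root > bound and is_increasing(left, root) and is_increasing(right, root)

sigma = [5, 2, 7, 1, 4, 6, 3]                 # an arbitrary permutation of size 7
tree = to_tree(sigma)
print(tree)
print(infix(tree) == sigma, is_increasing(tree))

# Distinct permutations give distinct trees, so there are n! trees of size n.
print(len({to_tree(list(p)) for p in permutations(range(1, 6))}))   # 120
```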

The construction of increasing trees is instrumental in deriving EGFs relative to various
local order patterns in permutations. We illustrate its use here by counting the number of
up-and-down (or zig-zag) permutations, also known as alternating permutations. The result,
already mentioned in our Invitation chapter (p. 2), was first derived by Désiré André in 1881 by
means of a direct recurrence argument.
A permutation $\sigma = \sigma_1\sigma_2\cdots\sigma_n$ is an alternating permutation if

$$
\sigma_1 > \sigma_2 < \sigma_3 > \sigma_4 < \cdots,
\tag{62}
$$

so that pairs of consecutive elements form a succession of ups and downs.

[Diagram: an alternating permutation drawn as a zigzag path.]


Consider first the case of an alternating permutation of odd size. It can be checked that the
corresponding increasing trees have no one-way branching nodes, so that they consist solely of
binary nodes and leaves. Thus, the corresponding specification is

$$
\mathcal{J} = \mathcal{Z} + (\mathcal{Z}^{\square} \star \mathcal{J} \star \mathcal{J}),
$$

so that

$$
J(z) = z + \int_0^z J(t)^2\,dt
\qquad\text{and}\qquad
\frac{d}{dz}J(z) = 1 + J(z)^2.
$$

The equation admits separation of variables, which implies, since $J(0) = 0$, that
$\arctan(J(z)) = z$, hence:

$$
J(z) = \tan(z) = z + 2\,\frac{z^3}{3!} + 16\,\frac{z^5}{5!} + 272\,\frac{z^7}{7!} + \cdots.
$$

The coefficients $J_{2n+1}$ are known as the tangent numbers or the Euler numbers of odd index
(EIS A000182).

Alternating permutations of even size defined by the constraint (62) and denoted by $\mathcal{K}$
can be determined from

$$
\mathcal{K} = \{\epsilon\} + (\mathcal{Z}^{\square} \star \mathcal{J} \star \mathcal{K}),
$$

since now all internal nodes of the tree representation are binary, except for the right-most one
that only branches on the left. Thus, $K'(z) = \tan(z)\,K(z)$, and the EGF is

$$
K(z) = \frac{1}{\cos(z)}
     = 1 + 1\,\frac{z^2}{2!} + 5\,\frac{z^4}{4!} + 61\,\frac{z^6}{6!} + 1385\,\frac{z^8}{8!} + \cdots,
$$

where the coefficients $K_{2n}$ are the secant numbers, also known as Euler numbers of even index
(EIS A000364).

Use will be made later in this book (Chapter III, p. 202) of this important tree representation
of permutations as it opens access to parameters such as the number of descents, runs,
and (once more!) records in permutations. Analyses of increasing trees also inform us of crucial
performance issues regarding binary search trees, quicksort, and heap-like priority queue
structures [429, 538, 598, 600].
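The tangent and secant numbers can be generated directly from the differential equations of this
example and compared with a brute-force count. The Python sketch below is an illustration: it
reads the recurrences $J_{n+1} = [n = 0] + \sum_k \binom{n}{k} J_k J_{n-k}$ and
$K_{n+1} = \sum_k \binom{n}{k} J_k K_{n-k}$ off $J' = 1 + J^2$ and $K' = J K$, and the function
names are hypothetical.

```python
from itertools import permutations
from math import comb

def tangent_secant_numbers(n_max):
    """EGF coefficients J_n of tan z and K_n of 1/cos z, from J' = 1 + J^2 and K' = J K."""
    J = [0] * (n_max + 1)
    K = [0] * (n_max + 1)
    K[0] = 1
    for n in range(n_max):
        J[n + 1] = (1 if n == 0 else 0) + sum(comb(n, k) * J[k] * J[n - k] for k in range(n + 1))
        K[n + 1] = sum(comb(n, k) * J[k] * K[n - k] for k in range(n + 1))
    return J, K

def is_alternating(p):
    """Pattern (62): p[0] > p[1] < p[2] > p[3] < ..."""
    return all(p[i] > p[i + 1] if i % 2 == 0 else p[i] < p[i + 1]
               for i in range(len(p) - 1))

def count_alternating(n):
    return sum(is_alternating(p) for p in permutations(range(1, n + 1)))

J, K = tangent_secant_numbers(8)
print(J[1::2])     # [1, 2, 16, 272]
print(K[0::2])     # [1, 1, 5, 61, 1385]
print([count_alternating(n) for n in range(1, 8)])   # [1, 1, 2, 5, 16, 61, 272]
```

Odd sizes are counted by the tangent numbers and even sizes by the secant numbers, as the
brute-force list confirms for sizes up to 7.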


▷ II.31. Combinatorics of trigonometrics. Interpret $\tan\frac{z}{1-z}$, $\tan\tan z$,
$\tan(e^z - 1)$ as EGFs of combinatorial classes. ◁

Figure II.17. An increasing Cayley tree (left) and its associated regressive mapping (right).


Example II.18. Increasing Cayley trees and regressive mappings. An increasing Cayley
tree is a Cayley tree (i.e., it is labelled, non-plane, and rooted) whose labels along any branch
stemming from the root form an increasing sequence. In particular, the minimum must occur
at the root, and no plane embedding is implied. Let $\mathcal{L}$ be the class of such trees. The
recursive specification is now

$$
\mathcal{L} = (\mathcal{Z}^{\square} \star \mathrm{SET}(\mathcal{L})).
$$

The generating function thus satisfies the functional relations

$$
L(z) = \int_0^z e^{L(t)}\,dt, \qquad L'(z) = e^{L(z)},
$$

with $L(0) = 0$. Integration of $L' e^{-L} = 1$ shows that $e^{-L} = 1 - z$, hence

$$
L(z) = \log\frac{1}{1-z}
\qquad\text{and}\qquad
L_n = (n-1)!.
$$


Thus the number of increasing Cayley trees is $(n-1)!$, which is also the number of permutations
of size $n-1$. These trees have been studied by Meir and Moon [435] under the name of
"recursive trees", a terminology that we do not, however, retain here.

The simplicity of the formula $L_n = (n-1)!$ certainly calls for a combinatorial interpretation.
In fact, an increasing Cayley tree is fully determined by its child-parent relationship
(Figure II.17). In other words, to each increasing Cayley tree $\tau$, we associate a partial map
$\phi = \phi_\tau$ such that $\phi(i) = j$ iff the label of the parent of $i$ is $j$. Since the
root of the tree is an orphan, the value of $\phi(1)$ is undefined, $\phi(1) = \bot$; since the
tree is increasing, one has $\phi(i) < i$ for all $i \ge 2$. A function satisfying these last two
conditions is called a regressive mapping. The correspondence between trees and regressive
mappings is then easily seen to be bijective.

Thus regressive mappings on the domain $[1..n]$ and increasing Cayley trees are equinumerous, so
that we may as well use $\mathcal{L}$ to denote the class of regressive mappings. Now, a
regressive mapping of size $n$ is evidently determined by a single choice for $\phi(2)$ (since
$\phi(2) = 1$), two possible choices for $\phi(3)$ (either of 1, 2), and so on. Hence the formula

$$
L_n = 1 \times 2 \times 3 \times \cdots \times (n-1)
$$

receives a natural interpretation.
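For small sizes the count $(n-1)!$ can be confirmed by exhaustive search over parent maps. The
Python sketch below is an illustration: it enumerates every choice of root and of a parent for
each other node, keeps those that form a rooted tree, and counts how many are increasing. It also
uses the classical count $n^{n-1}$ of all Cayley trees as a cross-check. The function names are
hypothetical.

```python
from itertools import product
from math import factorial

def is_rooted_tree(parent, root, n):
    """True if following parents from every node reaches `root` without cycling."""
    for v in range(1, n + 1):
        steps = 0
        while v != root:
            v = parent[v]
            steps += 1
            if steps > n:              # trapped in a cycle that avoids the root
                return False
    return True

def count_cayley_trees(n):
    """Return (number of Cayley trees, number of increasing ones) on labels 1..n."""
    total = increasing = 0
    for root in range(1, n + 1):
        others = [v for v in range(1, n + 1) if v != root]
        for choice in product(range(1, n + 1), repeat=n - 1):
            parent = dict(zip(others, choice))
            if is_rooted_tree(parent, root, n):
                total += 1
                # increasing tree <=> regressive mapping: parent(v) < v for every non-root v
                if all(parent[v] < v for v in others):
                    increasing += 1
    return total, increasing

for n in range(1, 6):
    print(n, count_cayley_trees(n), (n ** (n - 1), factorial(n - 1)))
```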


▷ II.32. Regressive mappings and permutations. Regressive mappings can be related directly
to permutations. The construction that associates a regressive mapping to a permutation is
called the "inversion table" construction; see [378, 538]. Given a permutation
$\sigma = \sigma_1, \ldots, \sigma_n$, associate to it a function $\psi = \psi_\sigma$ from
$[1..n]$ to $[0..n-1]$ by the rule

$$
\psi(j) = \operatorname{card}\{\,k < j \mid \sigma_k > \sigma_j\,\}.
$$

The function $\psi$ is a trivial variant of a regressive mapping. ◁
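The inversion table and its inverse are short to implement. The Python sketch below is an
illustration with 0-based list positions, so that `psi[j]` corresponds to $\psi(j+1)$ and
satisfies `psi[j] <= j`. The reconstruction procedure (working from the last position backwards)
is one standard way to invert the table and is an assumption of this sketch, not a construction
taken from the text.

```python
from itertools import permutations

def inversion_table(sigma):
    """psi[j] = number of earlier entries that are larger than sigma[j]."""
    return [sum(1 for k in range(j) if sigma[k] > sigma[j]) for j in range(len(sigma))]

def from_inversion_table(psi):
    """Rebuild the permutation of 1..n whose inversion table is psi."""
    n = len(psi)
    available = list(range(1, n + 1))          # values not yet placed, in increasing order
    sigma = [0] * n
    for j in range(n - 1, -1, -1):
        # sigma[j] must have exactly psi[j] larger values among those still available
        sigma[j] = available.pop(len(available) - 1 - psi[j])
    return sigma

sigma = [2, 3, 1]
psi = inversion_table(sigma)
print(psi, from_inversion_table(psi))          # [0, 0, 2] [2, 3, 1]
```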


▷ II.33. Rotations and increasing trees. An increasing Cayley tree can be canonically drawn
by ordering descendants of each node from left to right according to their label values. The
rotation correspondence (p. 73) then gives rise to a binary increasing tree. Hence, increasing
Cayley trees and increasing binary trees are also directly related. Summarizing this note and
the previous one, we have a quadruple combinatorial connection,

Increasing Cayley trees ≅ Regressive mappings ≅ Permutations ≅ Increasing binary trees,

which opens the way to yet more permutation enumerations. ◁


Example II.19. A parking problem. Here is Knuth's introduction to the problem, dating
back to 1973 (see [378, p. 545]), which nowadays might be regarded by some as politically
incorrect:


“A certain one-way street has m parking spaces in a row numbered 1 to m. A man and his 
dozing wife drive by, and suddenly, she wakes up and orders him to park immediately. He 
dutifully parks at the first available space [...].” 


Consider $n = m - 1$ cars and condition by the fact that everybody eventually finds a parking
space and the last space remains empty. There are $m^n = (n+1)^n$ possible sequences of
"wishes", among which only a certain number $F_n$ satisfy the condition; this number is to be
determined. (An important motivation for this problem is the analysis of hashing algorithms
examined in Note III.11, p. 178, under the "linear probing" strategy.)

A sequence satisfying the condition is called an almost-full allocation, its size $n$ being the
number of cars involved. Let $\mathcal{F}$ represent the class of almost-full allocations. We
claim the decomposition:

$$
\mathcal{F} = \bigl[(\Theta\mathcal{F} + \mathcal{F}) \star \mathcal{Z}^{\blacksquare} \star \mathcal{F}\bigr].
\tag{63}
$$

Indeed, consider the situation just before the car that arrives last parks; it will eventually
land in some position $k+1$ from the left. Then, there are two islands, which are themselves
almost-full allocations (of respective sizes $k$ and $n-k-1$). This last car's intended parking
wish must have been either one of the first $k$ occupied cells on the left (the factor
$\Theta\mathcal{F}$ in (63)) or the last empty cell of the first island (the term $\mathcal{F}$ in
the left factor); the right island is not affected (the factor $\mathcal{F}$ on the right).
Finally, the last car is inserted into the street (the factor $\mathcal{Z}^{\blacksquare}$).
Pictorially, we have a sort of binary tree decomposition of almost-full allocations.


Analytically, the translation of (63) into EGF is

$$
F(z) = \int_0^z \bigl(wF'(w) + F(w)\bigr)\,F(w)\,dw,
\tag{64}
$$

which, through differentiation, gives

$$
F'(z) = (zF(z))' \cdot F(z).
\tag{65}
$$

Simple manipulations do the rest: we have $F'/F = (zF)'$, which by integration gives
$\log F = zF$ and $F = e^{zF}$. Thus $F(z)$ satisfies a functional equation strangely similar to
that of the Cayley tree function $T(z)$; indeed, it is not hard to see that one has

$$
F(z) = \frac{1}{z}\,T(z)
\qquad\text{and}\qquad
F_n = (n+1)^{n-1},
\tag{66}
$$

which solves the original counting problem. The derivation above is based on articles by
Flajolet, Poblete, Viola, and Knuth [249, 380], who show that probabilistic properties of parking
allocations can be precisely analysed (for instance, total displacement, examined in Note VII.54,
p. 534, is found to be governed by an Airy distribution).
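The formula $F_n = (n+1)^{n-1}$ can be checked for small $n$ both by simulating the parking
process over all sequences of wishes and by unfolding the differential equation (65). The Python
sketch below is an illustration: the simulation encodes the rule "park at the first available
space at or after the wished one", and the recurrence
$F_{n+1} = \sum_k \binom{n}{k}(k+1)F_kF_{n-k}$ is obtained by reading EGF coefficients in (65).
Function names are hypothetical.

```python
from itertools import product
from math import comb

def is_almost_full(wishes):
    """Simulate n cars on a one-way street with m = n + 1 spaces numbered 1..m."""
    m = len(wishes) + 1
    occupied = [False] * (m + 1)              # index 0 is unused
    for w in wishes:
        pos = w
        while pos <= m and occupied[pos]:     # drive on to the first available space
            pos += 1
        if pos > m:
            return False                      # this driver found no space
        occupied[pos] = True
    return not occupied[m]                    # the last space must remain empty

def count_almost_full(n):
    return sum(is_almost_full(w) for w in product(range(1, n + 2), repeat=n))

def counts_from_recurrence(n_max):
    """F_{n+1} = sum_k C(n,k) (k+1) F_k F_{n-k}, from F' = (zF)' F with F_0 = 1."""
    F = [1]
    for n in range(n_max):
        F.append(sum(comb(n, k) * (k + 1) * F[k] * F[n - k] for k in range(n + 1)))
    return F

print([count_almost_full(n) for n in range(1, 6)])    # [1, 3, 16, 125, 1296]
print(counts_from_recurrence(5)[1:])                  # [1, 3, 16, 125, 1296]
```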


II.7. Perspective 


Together with the previous chapter and Figure I.18, this chapter and Figure II.18 
provide the basis for the symbolic method that is at the core of analytic combinatorics. 
The translations of the basic constructions for labelled classes to EGFs could hardly 
be simpler, but, as we have seen, they are sufficiently powerful to embrace numerous 
classical results in combinatorics, ranging from the birthday and coupon collector 
problems to tree and graph enumeration. 

The examples that we have considered for second-level structures, trees, map- 
pings, and graphs lead to EGFs that are simple to express and natural to generalize. 
(Often, the simple form is misleading—direct derivations of many of these EGFs that 
do not appeal to the symbolic method can be rather intricate.) Indeed, the symbolic 
method provides a framework that allows us to understand the nature of many of these 
combinatorial classes. From here, numerous seemingly scattered counting problems 
can be organized into broad structural categories and solved in an almost mechanical 
manner. 
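The "almost mechanical" character of the method can be made concrete with a few lines of code
that implement the dictionary of constructions on truncated EGFs. The Python sketch below is an
illustration only, not a system described in the text: EGFs are lists of exact coefficients of
$z^n$ up to an arbitrary order `N`, the argument of each construction is assumed to have zero
constant term, and all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

N = 8   # truncation order (arbitrary)

def seq(b):
    """SEQ: 1 / (1 - B(z)), via A = 1 + B A."""
    a = [Fraction(1)] + [Fraction(0)] * N
    for n in range(1, N + 1):
        a[n] = sum(b[k] * a[n - k] for k in range(1, n + 1))
    return a

def set_(b):
    """SET: exp(B(z)), via A' = B' A."""
    a = [Fraction(1)] + [Fraction(0)] * N
    for n in range(1, N + 1):
        a[n] = sum(k * b[k] * a[n - k] for k in range(1, n + 1)) / n
    return a

def cyc(b):
    """CYC: log 1/(1 - B(z)), via A' = B' / (1 - B)."""
    s = seq(b)
    a = [Fraction(0)] * (N + 1)
    for n in range(1, N + 1):
        a[n] = sum(k * b[k] * s[n - k] for k in range(1, n + 1)) / n
    return a

def counts(a):
    """Counting sequence n! [z^n] A(z)."""
    return [int(c * factorial(n)) for n, c in enumerate(a)]

Z = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 1)    # a single atom
nonempty_sets = [Fraction(0)] + set_(Z)[1:]                 # SET_{>=1}(Z), EGF e^z - 1

print(counts(set_(cyc(Z))))            # permutations as sets of cycles: n!
print(counts(set_(nonempty_sets)))     # set partitions
print(counts(seq(nonempty_sets)))      # surjections (ordered set partitions)
```

Each printed line is obtained by composing constructions exactly as in the corresponding
specification, with no problem-specific reasoning.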

Again, the symbolic method is only half of the story (the “combinatorics” in an- 
alytic combinatorics), leading to EGFs for the counting sequences of numerous inter- 
esting combinatorial classes. While some of these EGFs lead immediately to explicit 
counting results, others require classical techniques in complex analysis and asymp- 
totic analysis that are covered in Part B (the “analytic” part of analytic combinatorics) 
to deliver asymptotic estimates. Together with these techniques, the basic construc- 
tions, translations, and applications that we have discussed in this chapter reinforce 
the overall message that the symbolic method is a systematic approach that is success- 
ful for addressing classical and new problems in combinatorics, generalizations, and 
applications. 

We have been focusing on enumeration problems—counting the number of ob- 
jects of a given size in a combinatorial class. In the next chapter, we shall consider 
how to extend the symbolic method to help analyse other properties of combinatorial 
classes. 


1. The main constructions of union, product, sequence, set, and cycle for labelled structures
together with their translation into exponential generating functions.

$$
\begin{array}{lll}
 & \textit{Construction} & \textit{EGF} \\
\text{Union} & \mathcal{A} = \mathcal{B} + \mathcal{C} & A(z) = B(z) + C(z) \\
\text{Product} & \mathcal{A} = \mathcal{B} \star \mathcal{C} & A(z) = B(z) \cdot C(z) \\
\text{Sequence} & \mathcal{A} = \mathrm{SEQ}(\mathcal{B}) & A(z) = \dfrac{1}{1 - B(z)} \\
\text{Set} & \mathcal{A} = \mathrm{SET}(\mathcal{B}) & A(z) = \exp(B(z)) \\
\text{Cycle} & \mathcal{A} = \mathrm{CYC}(\mathcal{B}) & A(z) = \log\dfrac{1}{1 - B(z)}
\end{array}
$$

2. Sequences, sets, and cycles of fixed cardinality.

$$
\begin{array}{lll}
 & \textit{Construction} & \textit{EGF} \\
\text{Sequence} & \mathcal{A} = \mathrm{SEQ}_k(\mathcal{B}) & A(z) = B(z)^k \\
\text{Set} & \mathcal{A} = \mathrm{SET}_k(\mathcal{B}) & A(z) = \dfrac{1}{k!}\,B(z)^k \\
\text{Cycle} & \mathcal{A} = \mathrm{CYC}_k(\mathcal{B}) & A(z) = \dfrac{1}{k}\,B(z)^k
\end{array}
$$

3. The additional constructions of pointing and substitution.

$$
\begin{array}{lll}
 & \textit{Construction} & \textit{EGF} \\
\text{Pointing} & \mathcal{A} = \Theta\mathcal{B} & A(z) = z\,\dfrac{d}{dz}B(z) \\
\text{Substitution} & \mathcal{A} = \mathcal{B} \circ \mathcal{C} & A(z) = B(C(z))
\end{array}
$$

4. The "boxed" product.

$$
\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C})
\quad\Longrightarrow\quad
A(z) = \int_0^z \Bigl(\frac{d}{dt}B(t)\Bigr)\cdot C(t)\,dt.
$$

Figure II.18. A "dictionary" of labelled constructions together with their translation
into exponential generating functions (EGFs). The first constructions are counterparts
of the unlabelled constructions of the previous chapter (the multiset construction is
not meaningful here). Translation for composite constructions of bounded cardinality
appears to be simple. Finally, the boxed product is specific to labelled structures.
(Compare with the unlabelled counterpart, Figure I.18, p. 18.)


Bibliographic notes. The labelled set construction and the exponential formula were recognized
early by researchers working in the area of graphical enumerations [319]. Foata [265]
proposed a detailed formalization in 1974 of labelled constructions, especially sequences and
sets, under the names of partitional complex; a brief account is also given by Stanley in his
survey [550]. This is parallel to the concept of "prefab" due to Bender and Goldman [42]. The
books by Comtet [129], Wilf [608], Stanley [552], or Goulden and Jackson [303] have many
examples of the use of labelled constructions in combinatorial analysis.

Greene [308] has introduced in his 1983 dissertation a general framework of "labelled
grammars" largely based on the boxed product with implications for the random generation of
combinatorial structures. Joyal's theory of species dating from 1981 (see [359] for the original
article and the book by Bergeron, Labelle, and Leroux [50] for a rich exposition) is based on
category theory; it presents the advantage of uniting in a common framework the unlabelled and
the labelled worlds.

Flajolet, Salvy, and Zimmermann have developed a specification language closely related 
to the system expounded here. They show in [255] how to compile automatically specifica- 
tions into generating functions; this is complemented by a calculus that produces fast random 
generation algorithms [264]. 


I can see looming ahead one of those terrible exercises in probability where six men have 
white hats and six men have black hats and you have to work it out by mathematics how likely 
it is that the hats will get mixed up and in what proportion. If you start thinking about things 
like that, you would go round the bend. Let me assure you of that! 


—AGATHA CHRISTIE 
(The Mirror Crack’d. Toronto, Bantam Books, 1962.) 


Combinatorial Parameters and 
Multivariate Generating Functions 


Generating functions find averages, etc. 


— HERBERT WILF [608]

Many scientific endeavours demand precise quantitative information on probabilistic
properties of parameters of combinatorial objects. For instance, when designing,
analysing, and optimizing a sorting algorithm, it is of interest to determine the typi- 
cal disorder of data obeying a given model of randomness, and to do so in the mean, 
or even in distribution, either exactly or asymptotically. Similar situations arise in 
a broad variety of fields, including probability theory and statistics, computer sci- 
ence, information theory, statistical physics, and computational biology. The exact 
problem is then a refined counting problem with two parameters, namely, size and 
an additional characteristic: this is the subject addressed in this chapter and treated 
by a natural extension of the generating function framework. The asymptotic prob- 
lem can be viewed as one of characterizing in the limit a family of probability laws 
indexed by the values of the possible sizes: this is a topic to be discussed in Chap- 
ter IX. As demonstrated here, the symbolic methods initially developed for counting 
combinatorial objects adapt gracefully to the analysis of various sorts of parameters 
of constructible classes, unlabelled and labelled alike. 

Multivariate generating functions (MGFs)—ordinary or exponential—can keep 
track of a collection of parameters defined over combinatorial objects. From the 
knowledge of such generating functions, there result either explicit probability dis- 
tributions or, at least, mean and variance evaluations. For inherited parameters, all the 
combinatorial classes discussed so far are amenable to such a treatment. Technically, 
the translation schemes that relate combinatorial constructions and multivariate gen- 
erating functions present no major difficulty—they appear to be natural (notational, 
even) refinements of the paradigm developed in Chapters I and II for the univariate 
case. Typical applications from classical combinatorics are the number of summands
in a composition, the number of blocks in a set partition, the number of cycles in a
permutation, the root degree or path length of a tree, the number of fixed points in a 
permutation, the number of singleton blocks in a set partition, the number of leaves in 
trees of various sorts, and so on. 

Beyond its technical aspects anchored in symbolic methods, this chapter also 
serves as a first encounter with the general area of random combinatorial structures. 
The general question is: What does a random object of large size look like? Multi- 
variate generating functions first provide an easy access to moments of combinatorial 
parameters—typically the mean and variance. In addition, when combined with basic 
probabilistic inequalities, moment estimates often lead to precise characterizations of 
properties of large random structures that hold with high probability. For instance, 
a large integer partition conforms with high probability to a deterministic profile, a 
large random permutation almost surely has at least one long cycle and a few short 
ones, and so on. Such a highly constrained behaviour of large objects may in turn 
serve to design dedicated algorithms and optimize data structures; or it may serve to 
build statistical tests—when does one depart from randomness and detect a “signal” 
in large sets of observed data? Randomness forms a recurrent theme of the book: it 
will be developed much further in Chapter IX, where the complex asymptotic meth- 
ods of Part B are grafted on the exact modelling by multivariate generating functions 
presented in this chapter. 

This chapter is organized as follows. First a few pragmatic developments related
to bivariate generating functions are presented in Section III.1. Next, Section III.2
presents the notion of bivariate enumeration and its relation to discrete probabilistic
models, including the determination of moments, since the language of elementary
probability theory does indeed provide an intuitively appealing way to conceive of
bivariate counting data. The symbolic method per se, declined in its general multivariate
version, is centrally developed in Sections III.3 and III.4: with suitable multi-index
notations, the extension of the symbolic method to the multivariate case is almost
immediate. Recursive parameters that often arise in particular from tree statistics form
the subject of Section III.5, while complete generating functions and associated
combinatorial models are discussed in Section III.6. Additional constructions such as
pointing, substitution, and order constraints lead to interesting developments, in
particular, an original treatment of the inclusion-exclusion principle in Section III.7. The
chapter concludes, in Section III.8, with a brief abstract discussion of extremal
parameters like height in trees or smallest and largest components in composite structures;
such parameters are best treated via families of univariate generating functions.


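III.1. An introduction to bivariate generating functions (BGFs)


We have seen in Chapters I and II that a number sequence $(f_n)$ can be encoded
by means of a generating function in one variable, either ordinary or exponential:

$$
(f_n) \;\leadsto\; f(z) =
\begin{cases}
\displaystyle\sum_n f_n z^n & \text{(ordinary GF)} \\[2ex]
\displaystyle\sum_n f_n \frac{z^n}{n!} & \text{(exponential GF).}
\end{cases}
$$

$$
\begin{array}{lllcl}
f_{00} & & & \longrightarrow & f_0(u) \\
f_{10} & f_{11} & & \longrightarrow & f_1(u) \\
f_{20} & f_{21} & f_{22} & \longrightarrow & f_2(u) \\
\vdots & \vdots & \vdots & & \vdots \\
\downarrow & \downarrow & \downarrow & & \\
f^{\langle 0\rangle}(z) & f^{\langle 1\rangle}(z) & f^{\langle 2\rangle}(z) & &
\end{array}
$$

Figure III.1. An array of numbers and its associated horizontal and vertical GFs.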


This encoding is powerful, since many combinatorial constructions admit a translation 
as operations over such generating functions. In this way, one gains access to many 
useful counting formulae. 

Similarly, consider a sequence of numbers $(f_{n,k})$ depending on two integer-valued
indices, $n$ and $k$. Usually, in this book, $(f_{n,k})$ will be an array of numbers (often a
triangular array), where $f_{n,k}$ is the number of objects $\varphi$ in some class $\mathcal{F}$,
such that $|\varphi| = n$ and some parameter $\chi(\varphi)$ is equal to $k$. We can encode this
sequence by means of a bivariate generating function (BGF) involving two variables: a primary
variable $z$ attached to $n$ and a secondary $u$ attached to $k$.


Definition III.1. The bivariate generating functions (BGFs), either ordinary or
exponential, of an array $(f_{n,k})$ are the formal power series in two variables defined by

\[
f(z,u) =
\begin{cases}
\displaystyle\sum_{n,k} f_{n,k}\, z^n u^k & \text{(ordinary BGF)} \\[1ex]
\displaystyle\sum_{n,k} f_{n,k}\, \frac{z^n}{n!}\, u^k & \text{(exponential BGF).}
\end{cases}
\]

(The "double exponential" GF corresponding to $\frac{z^n}{n!}\frac{u^k}{k!}$ is not used in the book.)

As we shall see shortly, parameters of constructible classes become accessible
through such BGFs. According to the point of view adopted for the moment, one
starts with an array of numbers and forms a BGF by a double summation process. We
present here two examples related to binomial coefficients and Stirling cycle numbers
illustrating how such BGFs can be determined, then manipulated. In what follows it
is convenient to refer to the horizontal and vertical generating functions (Figure III.1)
that are each a one-parameter family of GFs in a single variable defined by

\[
\begin{array}{lll}
\text{horizontal GF:} & f_n(u) := \displaystyle\sum_k f_{n,k}\, u^k & \\[1ex]
\text{vertical GF:} & f^{\langle k\rangle}(z) := \displaystyle\sum_n f_{n,k}\, z^n & \text{(ordinary case)} \\[1ex]
 & f^{\langle k\rangle}(z) := \displaystyle\sum_n f_{n,k}\, \frac{z^n}{n!} & \text{(exponential case).}
\end{array}
\]

Figure III.2. The set $\mathcal{W}_5$ of the 32 binary words over the alphabet $\{\square, \blacksquare\}$
enumerated according to the number of occurrences of the letter '$\blacksquare$' gives rise to the
bivariate counting sequence $\{W_{5,j}\} = 1, 5, 10, 10, 5, 1$.


The terminology is transparently explained if the elements $(f_{n,k})$ are arranged as an
infinite matrix, with $f_{n,k}$ placed in row $n$ and column $k$, since the horizontal and
vertical GFs appear as the GFs of the rows and columns respectively. Naturally, one
has

\[
f(z,u) = \sum_k u^k f^{\langle k\rangle}(z) =
\begin{cases}
\displaystyle\sum_n f_n(u)\, z^n & \text{(ordinary BGF)} \\[1ex]
\displaystyle\sum_n f_n(u)\, \frac{z^n}{n!} & \text{(exponential BGF).}
\end{cases}
\]


Example III.1. The ordinary BGF of binomial coefficients. The binomial coefficient $\binom{n}{k}$ counts
binary words of length $n$ having $k$ occurrences of a designated letter; see Figure III.2. In order
to compose the bivariate GF, start from the simplest case of Newton's binomial theorem and
directly form the horizontal GFs corresponding to a fixed $n$:

\[ W_n(u) := \sum_{k=0}^{n} \binom{n}{k} u^k = (1+u)^n. \tag{1} \]

Then a summation over all values of $n$ gives the ordinary BGF

\[ W(z,u) = \sum_{k,n\ge 0} \binom{n}{k} u^k z^n = \sum_{n\ge 0} (1+u)^n z^n = \frac{1}{1 - z(1+u)}. \tag{2} \]

Such calculations are typical of BGF manipulations. What we have done amounts to starting
from a sequence of numbers, $W_{n,k}$, determining the horizontal GFs $W_n(u)$ in (1), then the
bivariate GF $W(z,u)$ in (2), according to the scheme:

\[ W_{n,k} \;\leadsto\; W_n(u) \;\leadsto\; W(z,u). \]

The BGF in (2) reduces to the OGF $(1 - 2z)^{-1}$ of all words, as it should, upon setting $u = 1$.
In addition, one can deduce from (2) the vertical GFs of the binomial coefficients
corresponding to a fixed value of $k$,

\[ W^{\langle k\rangle}(z) = \sum_{n\ge 0} \binom{n}{k} z^n = \frac{z^k}{(1-z)^{k+1}}, \]

from an expansion of the BGF with respect to $u$,

\[ W(z,u) = \frac{1}{1-z}\,\frac{1}{1 - u\,\frac{z}{1-z}} = \sum_{k\ge 0} u^k \frac{z^k}{(1-z)^{k+1}}, \tag{3} \]

and the result naturally matches what a direct calculation would give. ■
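The scheme of Example III.1 can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `bgf_rows` and `vertical_gf` are hypothetical. It expands $1/(1 - z(1+u))$ row by row, using the relation $W_n(u) = (1+u)W_{n-1}(u)$ that follows from $(1 - z(1+u))W(z,u) = 1$, and compares one column with the expansion of $z^k/(1-z)^{k+1}$.

```python
def bgf_rows(n_max):
    """Rows W_n(u) of W(z,u) = 1/(1 - z(1+u)), as coefficient lists in u.

    The identity (1 - z(1+u)) W(z,u) = 1 gives W_0 = 1 and
    W_n(u) = (1+u) W_{n-1}(u).
    """
    rows = [[1]]
    for _ in range(n_max):
        prev = rows[-1]
        nxt = [0] * (len(prev) + 1)
        for k, c in enumerate(prev):
            nxt[k] += c        # contribution of the term 1 in (1+u)
            nxt[k + 1] += c    # contribution of the term u in (1+u)
        rows.append(nxt)
    return rows


def vertical_gf(k, n_max):
    """Coefficients of z^k/(1-z)^(k+1) up to z^n_max, via k+1 prefix sums."""
    series = [0] * (n_max + 1)
    if k <= n_max:
        series[k] = 1                  # the monomial z^k
    for _ in range(k + 1):             # dividing by (1-z) is a prefix sum
        for n in range(1, n_max + 1):
            series[n] += series[n - 1]
    return series


N = 8
rows = bgf_rows(N)
column_2 = [rows[n][2] if n >= 2 else 0 for n in range(N + 1)]
print(rows[5])                          # [1, 5, 10, 10, 5, 1]
print(column_2 == vertical_gf(2, N))    # True
```

Row 5 reproduces the counting sequence of Figure III.2, and column 2 agrees with the vertical GF $z^2/(1-z)^3$.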


▷ III.1. The exponential BGF of binomial coefficients. This is

\[ \widetilde{W}(z,u) = \sum_{k,n} \binom{n}{k} u^k \frac{z^n}{n!} = \sum_{n} (1+u)^n \frac{z^n}{n!} = e^{z(1+u)}. \tag{4} \]

The vertical GFs are $e^z z^k / k!$. The horizontal GFs are $(1+u)^n$, as in the ordinary case. ◁


Example III.2. The exponential BGF of Stirling cycle numbers. As seen in Example II.12, p. 121,
the number $P_{n,k}$ of permutations of size $n$ having $k$ cycles equals the Stirling cycle number
$\left[{n \atop k}\right]$, a vertical EGF being

\[ P^{\langle k\rangle}(z) := \sum_n \left[{n \atop k}\right] \frac{z^n}{n!} = \frac{L(z)^k}{k!}, \qquad \text{where } L(z) := \log\frac{1}{1-z}. \]

From this, the exponential BGF is formed as follows (this revisits the calculations on p. 121):

\[ P(z,u) := \sum_k P^{\langle k\rangle}(z)\, u^k = \sum_k \frac{u^k}{k!} L(z)^k = e^{uL(z)} = (1-z)^{-u}. \tag{5} \]

The simplification is quite remarkable but altogether quite typical, as we shall see shortly, in the
context of a labelled set construction. The starting point is thus a collection of vertical EGFs
and the scheme is now

\[ P_{n,k} \;\leadsto\; P^{\langle k\rangle}(z) \;\leadsto\; P(z,u). \]

The BGF in (5) reduces to the EGF $(1 - z)^{-1}$ of all permutations, upon setting $u = 1$.
Furthermore, an expansion of the BGF in terms of the variable $z$ provides useful
information; namely, the horizontal GF is obtained by Newton's binomial theorem:

\[ P(z,u) = \sum_{n\ge 0} \binom{n+u-1}{n} z^n = \sum_{n\ge 0} P_n(u) \frac{z^n}{n!}, \qquad \text{where } P_n(u) = u(u+1)\cdots(u+n-1). \tag{6} \]

This last polynomial is called the Stirling cycle polynomial of index $n$ and it describes
completely the distribution of the number of cycles in all permutations of size $n$. In addition, the
relation

\[ P_n(u) = P_{n-1}(u)\,(u + (n-1)) \]

is equivalent to the recurrence

\[ \left[{n \atop k}\right] = (n-1)\left[{n-1 \atop k}\right] + \left[{n-1 \atop k-1}\right], \]

by which Stirling numbers are often defined and easily evaluated numerically; see also
Appendix A.8: Stirling numbers, p. 735. (The recurrence is susceptible to a direct combinatorial
interpretation: add $n$ either to an existing cycle or as a "new" singleton.) ■

Figure III.3. The various GFs associated with binomial coefficients (left) and Stirling
cycle numbers (right).


Concise expressions for BGFs, like (2), (3), (5), or (18), are summarized in
Figure III.3; they are invaluable for deriving moments, variance, and even finer
characteristics of distributions, as we see next. The determination of such BGFs can be covered
by a simple extension of the symbolic method, as will be detailed in Sections III. 3
and III. 4.
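As a small computational companion to Example III.2, the Stirling cycle polynomial $P_n(u) = u(u+1)\cdots(u+n-1)$ can be built by successive multiplication by $(u + m)$, which is the recurrence for Stirling cycle numbers in disguise. This is only a sketch; the function name `stirling_cycle_row` is a hypothetical choice, not notation from the text.

```python
from math import factorial


def stirling_cycle_row(n):
    """Coefficients of P_n(u) = u(u+1)...(u+n-1); entry k is the number
    of permutations of size n with k cycles."""
    poly = [1]                          # P_0(u) = 1
    for m in range(n):                  # multiply by (u + m)
        nxt = [0] * (len(poly) + 1)
        for k, c in enumerate(poly):
            nxt[k] += m * c             # new element joins an existing cycle
            nxt[k + 1] += c             # new element forms a singleton cycle
        poly = nxt
    return poly


row = stirling_cycle_row(5)
print(row)                        # [0, 24, 50, 35, 10, 1]
print(sum(row) == factorial(5))   # True
```

Setting $u = 1$ amounts to summing the row, which recovers $n!$, in agreement with the reduction of (5) to the EGF of all permutations.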


III. 2. Bivariate generating functions and probability distributions 


Our purpose in this book is to analyse characteristics of a broad range of combi- 
natorial types. The eventual goal of multivariate enumeration is the quantification of 
properties present with high regularity in large random structures. 

We shall be principally interested in enumeration according to size and an auxiliary
parameter, the corresponding problems being naturally treated by means of BGFs.
In order to avoid redundant definitions, it proves convenient to introduce the sequence
of fundamental factors $(\omega_n)_{n\ge 0}$, defined by

\[ \omega_n = 1 \ \text{ for ordinary GFs}, \qquad \omega_n = n! \ \text{ for exponential GFs}. \tag{7} \]

Then, the OGF and EGF of a sequence $(f_n)$ are jointly represented as

\[ f(z) = \sum_n f_n \frac{z^n}{\omega_n} \qquad\text{and}\qquad f_n = \omega_n\, [z^n] f(z). \]


Definition III.2. Given a combinatorial class $\mathcal{A}$, a (scalar) parameter is a function
from $\mathcal{A}$ to $\mathbb{Z}_{\ge 0}$ that associates to any object $\alpha \in \mathcal{A}$ an integer value $\chi(\alpha)$. The
sequence

\[ A_{n,k} = \operatorname{card}\bigl(\{\alpha \in \mathcal{A} \mid |\alpha| = n,\ \chi(\alpha) = k\}\bigr) \]

is called the counting sequence of the pair $\mathcal{A}, \chi$. The bivariate generating function
(BGF) of $\mathcal{A}, \chi$ is defined as

\[ A(z,u) := \sum_{n,k\ge 0} A_{n,k}\, \frac{z^n}{\omega_n}\, u^k, \]

and is ordinary if $\omega_n \equiv 1$ and exponential if $\omega_n \equiv n!$. One says that the variable $z$
marks size and the variable $u$ marks the parameter $\chi$.

Naturally $A(z, 1)$ reduces to the usual counting generating function $A(z)$ associated
with $\mathcal{A}$, and the cardinality of $\mathcal{A}_n$ is expressible as

\[ A_n = \omega_n\, [z^n] A(z, 1). \]


III. 2.1. Distributions and moments. Within this subsection, we examine the
relationship between probabilistic models needed to interpret bivariate counting
sequences and bivariate generating functions. The elementary notions needed are
recalled in Appendix A.3: Combinatorial probability, p. 727.


Consider a combinatorial class $\mathcal{A}$. The uniform probability distribution over $\mathcal{A}_n$
assigns to any $\alpha \in \mathcal{A}_n$ a probability equal to $1/A_n$. We shall use the symbol $\mathbb{P}$ to
denote probability and occasionally subscript it with an indication of the probabilistic
model used, whenever this model needs to be stressed: we shall then write $\mathbb{P}_{\mathcal{A}_n}$ (or
simply $\mathbb{P}_n$ if $\mathcal{A}$ is understood) to indicate probability relative to the uniform
distribution over $\mathcal{A}_n$.


Probability generating functions. Consider a parameter $\chi$. It determines over
each $\mathcal{A}_n$ a discrete random variable defined over the discrete probability space $\mathcal{A}_n$:

\[ \mathbb{P}_{\mathcal{A}_n}(\chi = k) = \frac{A_{n,k}}{A_n} = \frac{A_{n,k}}{\sum_k A_{n,k}}. \tag{8} \]

Given a discrete random variable $X$, typically, a parameter $\chi$ taken over a subclass $\mathcal{A}_n$,
we recall that its probability generating function (PGF) is by definition the quantity

\[ p(u) := \sum_k \mathbb{P}(X = k)\, u^k. \tag{9} \]


From (8) and (9), one has immediately: 


Proposition III.1 (PGFs from BGFs). Let $A(z, u)$ be the bivariate generating
function of a parameter $\chi$ defined over a combinatorial class $\mathcal{A}$. The probability
generating function of $\chi$ over $\mathcal{A}_n$ is given by

\[ \sum_k \mathbb{P}_{\mathcal{A}_n}(\chi = k)\, u^k = \frac{[z^n] A(z,u)}{[z^n] A(z,1)}, \]

and is thus a normalized version of a horizontal generating function.
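Proposition III.1 can be illustrated on cycles in permutations. The sketch below is an illustration under stated assumptions rather than the book's own procedure: the names `stirling_cycle_table` and `cycle_pgf` are hypothetical, and the Stirling cycle numbers are generated from the recurrence of Example III.2. Each row is normalized by $n!$ to obtain the PGF of the number of cycles over permutations of size $n$.

```python
from fractions import Fraction
from math import factorial


def stirling_cycle_table(n_max):
    """Table c[n][k] of Stirling cycle numbers, from the recurrence
    c[n][k] = (n-1) c[n-1][k] + c[n-1][k-1], with c[0][0] = 1."""
    c = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    c[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            c[n][k] = (n - 1) * c[n - 1][k] + c[n - 1][k - 1]
    return c


def cycle_pgf(n):
    """PGF coefficients: horizontal GF of row n divided by n! = omega_n [z^n] P(z,1)."""
    row = stirling_cycle_table(n)[n]
    return [Fraction(x, factorial(n)) for x in row]


pgf = cycle_pgf(20)
print(sum(pgf) == 1)       # True: the coefficients form a probability distribution
print(float(pgf[10]))      # probability that a permutation of size 20 has 10 cycles
```

Exact rational arithmetic (`Fraction`) is used so that the normalization can be checked without rounding; the second printed value is the kind of probability that the discussion following the proposition refers to.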


The translation into the language of probability enables us to make use of which- 
ever intuition might be available in any particular case, while allowing for a natu- 
ral interpretation of data (Figure III.4). Indeed, instead of noting that the quantity 
381922055502195 represents the number of permutations of size 20 that have 10 
cycles, it is perhaps more informative to state the probability of the event, which is 
0.00015, i.e., about 1.5 per 10 000. Discrete distributions are conveniently represented 
by histograms or “bar charts”, where the height of the bar at abscissa k indicates the 
value of P{X = k}. Figure III.4 displays two classical combinatorial distributions 
in this way. Given the uniform probabilistic model that we have been adopting, such
histograms are eventually nothing but a condensed form of the "stacks" corresponding
to exhaustive listings, like the one displayed in Figure III.2.

Figure III.4. Histograms of two combinatorial distributions. Left: the number of
occurrences of a designated letter in a random binary word of length 50 (binomial
distribution). Right: the number of cycles in a random permutation of size 50 (Stirling
cycle distribution).


Moments. Important information is conveyed by moments. Given a discrete random
variable $X$, the expectation of $f(X)$ is by definition the linear functional

\[ \mathbb{E}(f(X)) := \sum_k \mathbb{P}\{X = k\} \cdot f(k). \]

The (power) moments are

\[ \mathbb{E}(X^r) := \sum_k \mathbb{P}\{X = k\} \cdot k^r. \]

Then the expectation (or average, mean) of $X$, its variance, and its standard deviation,
respectively, are expressed as

\[ \mathbb{E}(X), \qquad \mathbb{V}(X) = \mathbb{E}(X^2) - \mathbb{E}(X)^2, \qquad \sigma(X) = \sqrt{\mathbb{V}(X)}. \]

The expectation corresponds to what is typically seen when forming the arithmetic
mean value of a large number of observations: this property is the weak law of large
numbers [205, Ch. X]. The standard deviation then measures the dispersion of values
observed from the expectation and it does so in a mean-quadratic sense.

The factorial moment defined for order r as 


\[ \mathbb{E}\bigl(X(X-1)\cdots(X-r+1)\bigr) \tag{10} \]


is also of interest for computational purposes, since it is obtained plainly by differen- 
tiation of PGFs (Appendix A.3: Combinatorial probability, p. 727). Power moments 
are then easily recovered as linear combinations of factorial moments, see Note III.9 
of Appendix A. In summary: 


Proposition III.2 (Moments from BGFs). The factorial moment of order $r$ of a
parameter $\chi$ is determined from the BGF $A(z, u)$ by $r$-fold differentiation followed by
evaluation at $u = 1$:

\[ \mathbb{E}_{\mathcal{A}_n}\bigl(\chi(\chi-1)\cdots(\chi-r+1)\bigr) = \frac{[z^n]\,\partial_u^r A(z,u)\big|_{u=1}}{[z^n] A(z,1)}. \]

In particular, the first two moments satisfy

\[ \mathbb{E}_{\mathcal{A}_n}(\chi) = \frac{[z^n]\,\partial_u A(z,u)\big|_{u=1}}{[z^n] A(z,1)}, \qquad
\mathbb{E}_{\mathcal{A}_n}(\chi^2) = \frac{[z^n]\,\partial_u^2 A(z,u)\big|_{u=1}}{[z^n] A(z,1)} + \frac{[z^n]\,\partial_u A(z,u)\big|_{u=1}}{[z^n] A(z,1)}, \]

the variance and standard deviation being determined by

\[ \mathbb{V}(\chi) = \sigma(\chi)^2 = \mathbb{E}(\chi^2) - \mathbb{E}(\chi)^2. \]

Proof. The PGF $p_n(u)$ of $\chi$ over $\mathcal{A}_n$ is given by Proposition III.1. On the other hand,
factorial moments are on general grounds obtained by differentiation and evaluation
at $u = 1$. The result follows. ■


In other words, the quantities

\[ \Omega_n^{(k)} := \omega_n \cdot [z^n]\,\partial_u^k A(z,u)\big|_{u=1} \]

give, after a simple normalization (by $\omega_n \cdot [z^n] A(z, 1)$), the factorial moments:

\[ \mathbb{E}\bigl(\chi(\chi-1)\cdots(\chi-k+1)\bigr) = \frac{1}{A_n}\,\Omega_n^{(k)}. \]

Most notably, $\Omega_n^{(1)}$ is the cumulated value of $\chi$ over all objects of $\mathcal{A}_n$:

\[ \Omega_n^{(1)} \equiv \omega_n \cdot [z^n]\,\partial_u A(z,u)\big|_{u=1} = \sum_{\alpha \in \mathcal{A}_n} \chi(\alpha) \equiv A_n \cdot \mathbb{E}_{\mathcal{A}_n}(\chi). \]

Accordingly, the GF (ordinary or exponential) of the $\Omega_n^{(1)}$ is sometimes named the
cumulative generating function. It can be viewed as an unnormalized generating
function of the sequence of expected values. These considerations explain Wilf's
suggestive motto quoted on p. 151: "Generating functions find averages, etc". (The "etc" can
be interpreted as a token for higher moments and probability distributions.)
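
▷ III.2. A combinatorial form of cumulative GFs. One has

\[ \Omega^{(1)}(z) \equiv \sum_n \mathbb{E}_{\mathcal{A}_n}(\chi)\, A_n \frac{z^n}{\omega_n} = \sum_{\alpha \in \mathcal{A}} \chi(\alpha)\, \frac{z^{|\alpha|}}{\omega_{|\alpha|}}, \]

where $\omega_n = 1$ (ordinary case) or $\omega_n = n!$ (exponential case). ◁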


Example III.3. Moments of the binomial distribution. The binomial distribution of index $n$ can
be defined as the distribution of the number of $a$s in a random word of length $n$ over the binary
alphabet $\{a, b\}$. The determination of moments results easily from the ordinary BGF,

\[ W(z,u) = \frac{1}{1 - z - zu}. \]

By differentiation, one finds

\[ \frac{\partial^r}{\partial u^r} W(z,u)\Big|_{u=1} = \frac{r!\, z^r}{(1-2z)^{r+1}}. \]

Coefficient extraction then gives the form of the factorial moments of orders $1, 2, 3, \ldots, r$ as

\[ \frac{n}{2}, \quad \frac{n(n-1)}{4}, \quad \frac{n(n-1)(n-2)}{8}, \quad \ldots, \quad \frac{r!}{2^r}\binom{n}{r}. \]

In particular, the mean and the variance are $\frac{1}{2}n$ and $\frac{1}{4}n$. The standard deviation is thus $\frac{1}{2}\sqrt{n}$,
which is of a smaller order than the mean: this indicates that the distribution is somehow
concentrated around its mean value, as suggested by Figure III.4. ■
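The route from a horizontal GF to mean and variance used in Example III.3 can be sketched in a few lines. The helper `mean_and_variance` below is a hypothetical name introduced for illustration; it takes the coefficient list of a horizontal GF $f_n(u)$ and applies Proposition III.2 with $r = 1, 2$, using $\mathbb{E}(X^2) = \mathbb{E}(X(X-1)) + \mathbb{E}(X)$.

```python
from fractions import Fraction
from math import comb


def mean_and_variance(row):
    """Mean and variance of the distribution with weight row[k] on the value k.

    The r-th derivative of the horizontal GF at u = 1, divided by its
    value at u = 1, is the factorial moment of order r.
    """
    total = sum(row)                                           # f_n(1)
    first = sum(k * c for k, c in enumerate(row))              # f_n'(1)
    second = sum(k * (k - 1) * c for k, c in enumerate(row))   # f_n''(1)
    m1 = Fraction(first, total)           # E(X)
    m2 = Fraction(second, total) + m1     # E(X^2) = E(X(X-1)) + E(X)
    return m1, m2 - m1 * m1


n = 10
row = [comb(n, k) for k in range(n + 1)]
print(mean_and_variance(row))   # (Fraction(5, 1), Fraction(5, 2))
```

For $n = 10$ the result is mean $n/2 = 5$ and variance $n/4 = 5/2$, as stated in the example.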


▷ III.3. De Moivre's approximation of the binomial coefficients. The fact that the mean and
the standard deviation of the binomial distribution are respectively $\frac{1}{2}n$ and $\frac{1}{2}\sqrt{n}$ suggests we
examine what goes on at a distance of $x$ standard deviations from the mean. Consider for
simplicity the case of $n = 2\nu$ even. From the ratio

\[ r(\nu, \ell) := \frac{\binom{2\nu}{\nu+\ell}}{\binom{2\nu}{\nu}} = \frac{\bigl(1 - \frac{1}{\nu}\bigr)\bigl(1 - \frac{2}{\nu}\bigr)\cdots\bigl(1 - \frac{\ell-1}{\nu}\bigr)}{\bigl(1 + \frac{1}{\nu}\bigr)\bigl(1 + \frac{2}{\nu}\bigr)\cdots\bigl(1 + \frac{\ell}{\nu}\bigr)}, \]

the approximation $\log(1 + x) = x + O(x^2)$ shows that, for any fixed $y \in \mathbb{R}$,

\[ \lim_{n\to\infty,\ \ell = \nu + y\sqrt{\nu/2}} \frac{\binom{2\nu}{\ell}}{\binom{2\nu}{\nu}} = e^{-y^2/2}. \]

(Alternatively, Stirling's formula can be employed.) This Gaussian approximation for the
binomial distribution was discovered by Abraham de Moivre (1667-1754), a close friend of Newton.
General methods for establishing such approximations are developed in Chapter IX. ◁


Example III.4. Moments of the Stirling cycle distribution. Let us return to the example of
cycles in permutations which is of interest in connection with certain sorting algorithms like
bubble sort or insertion sort, maximum finding, and in situ rearrangement [374].

We are dealing with labelled objects, hence exponential generating functions. As seen
earlier on p. 155, the BGF of permutations counted according to cycles is

\[ P(z,u) = (1-z)^{-u}. \]

By differentiating the BGF with respect to $u$, then setting $u = 1$, we next get the expected
number of cycles in a random permutation of size $n$ as a Taylor coefficient:

\[ \mathbb{E}_n(\chi) = [z^n]\, \frac{1}{1-z} \log\frac{1}{1-z} = 1 + \frac{1}{2} + \cdots + \frac{1}{n}, \tag{11} \]

which is the harmonic number $H_n$. Thus, on average, a random permutation of size $n$ has about
$\log n + \gamma$ cycles, a well-known fact of discrete probability theory, derived on p. 122 by means
of horizontal generating functions.

For the variance, a further differentiation of the bivariate EGF gives

\[ \sum_{n\ge 0} \mathbb{E}_n(\chi(\chi-1))\, z^n = \frac{1}{1-z}\Bigl(\log\frac{1}{1-z}\Bigr)^2. \tag{12} \]

From this expression and Note III.4 (or directly from the Stirling cycle polynomials of p. 155),
a calculation shows that

\[ \sigma_n^2 = \Bigl(\sum_{k=1}^{n} \frac{1}{k}\Bigr) - \Bigl(\sum_{k=1}^{n} \frac{1}{k^2}\Bigr) = \log n + \gamma - \frac{\pi^2}{6} + O\Bigl(\frac{1}{n}\Bigr). \tag{13} \]

Thus, asymptotically,

\[ \sigma_n \sim \sqrt{\log n}. \]

The standard deviation is of an order smaller than the mean, and therefore large deviations from
the mean have an asymptotically negligible probability of occurrence (see below the discussion
of moment inequalities). Furthermore, the distribution is asymptotically Gaussian, as we shall
see in Chapter IX, p. 644. ■

▷ III.4. Stirling cycle numbers and harmonic numbers. By the "exp–log trick" of Chapter I,
p. 29, the PGF of the Stirling cycle distribution satisfies

\[ \frac{1}{n!}\, u(u+1)\cdots(u+n-1) = \exp\Bigl( v H_n^{(1)} - \frac{v^2}{2} H_n^{(2)} + \frac{v^3}{3} H_n^{(3)} - \cdots \Bigr), \qquad u = 1 + v, \]

where $H_n^{(r)}$ is the generalized harmonic number $\sum_{j=1}^{n} j^{-r}$. Consequently, any moment of the
distribution is a polynomial in generalized harmonic numbers; compare (11) and (13).
Furthermore, the $k$th moment satisfies $\mathbb{E}_{\mathcal{P}_n}(\chi^k) \sim (\log n)^k$. (The same technique expresses the
Stirling cycle number $\left[{n \atop k}\right]$ as a polynomial in generalized harmonic numbers $H_{n-1}^{(r)}$.)

Alternatively, start from the expansion of $(1 - z)^{-\alpha}$ and differentiate repeatedly with
respect to $\alpha$; for instance, one has

\[ (1-z)^{-\alpha} \log\frac{1}{1-z} = \sum_{n\ge 0} \Bigl( \frac{1}{\alpha} + \frac{1}{\alpha+1} + \cdots + \frac{1}{n-1+\alpha} \Bigr) \binom{n+\alpha-1}{n} z^n, \]

which provides (11) upon setting $\alpha = 1$, while the next differentiation gives (13). ◁
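The "calculation" leading to the exact part of (13) can be spelled out as follows. This is a sketch that takes for granted the coefficient identity $[z^n]\,(1-z)^{-1}\log^2 (1-z)^{-1} = H_n^2 - H_n^{(2)}$, which is the second-order term of the expansion in Note III.4; here $H_n = H_n^{(1)}$ and $H_n^{(2)}$ are the harmonic numbers of orders 1 and 2, and the normalizing coefficient $[z^n](1-z)^{-1}$ equals 1.

```latex
\begin{align*}
\mathbb{E}_n\bigl(\chi(\chi-1)\bigr)
  &= [z^n]\,\frac{1}{1-z}\Bigl(\log\frac{1}{1-z}\Bigr)^{2}
   = H_n^{2} - H_n^{(2)}, \\
\sigma_n^{2}
  &= \mathbb{E}_n\bigl(\chi(\chi-1)\bigr) + \mathbb{E}_n(\chi) - \mathbb{E}_n(\chi)^{2} \\
  &= \bigl(H_n^{2} - H_n^{(2)}\bigr) + H_n - H_n^{2}
   = H_n - H_n^{(2)} .
\end{align*}
```

The asymptotic form in (13) then follows from $H_n = \log n + \gamma + O(1/n)$ and $H_n^{(2)} = \pi^2/6 + O(1/n)$.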


The situation encountered with cycles in permutations is typical of iterative
(non-recursive) structures. In many other cases, especially when dealing with recursive
structures, the bivariate GF may satisfy complicated functional equations in two
variables (see the example of path length in trees, Section III. 5 below), which means we
do not know them explicitly. However, asymptotic laws can be determined in a large
number of cases (Chapter IX). In all cases, the BGFs are the central tool in obtaining
mean and variance estimates, since their derivatives evaluated at $u = 1$ become
univariate GFs that usually satisfy much simpler relations than the BGFs themselves.


III. 2.2. Moment inequalities and concentration of distributions. Qualitatively
speaking, families of distributions can be classified into two categories: (i)
distributions that are spread, i.e., the standard deviation is of order at least as large as the
mean (e.g., the uniform distributions over $[0\,..\,n]$, which have totally flat histograms);
(ii) distributions for which the standard deviation is of an asymptotic order smaller
than the mean (e.g., the Stirling cycle distribution, Figure III.4, and the binomial
distribution, Figure III.5). Such informal observations are indeed supported by the
Markov–Chebyshev inequalities, which take advantage of information provided by the first two
moments. (A proof is found in Appendix A.3: Combinatorial probability, p. 727.)


Markov–Chebyshev inequalities. Let $X$ be a non-negative random variable and $Y$
an arbitrary real variable. One has for any $t > 0$:

\[
\begin{array}{ll}
\mathbb{P}\{X \ge t\,\mathbb{E}(X)\} \le \dfrac{1}{t} & \text{(Markov inequality)} \\[2ex]
\mathbb{P}\{|Y - \mathbb{E}(Y)| \ge t\,\sigma(Y)\} \le \dfrac{1}{t^2} & \text{(Chebyshev inequality).}
\end{array}
\]

This result informs us that the probability of being much larger than the mean must
decay (Markov) and that an upper bound on the decay is measured in units given by
the standard deviation (Chebyshev).

The next proposition formalizes a concentration property of distributions. It
applies to a family of distributions indexed by the integers.

Figure III.5. Plots of the binomial distributions for $n = 5, \ldots, 50$. The horizontal
axis is normalized (by a factor of $1/n$) and rescaled to 1, so that the curves display
$\{\mathbb{P}(X_n/n = x)\}$, for $x = 0, \frac{1}{n}, \frac{2}{n}, \ldots$.


Proposition III.3 (Concentration of distribution). Consider a family of random
variables $X_n$, typically, a scalar parameter $\chi$ on the subclass $\mathcal{A}_n$. Assume that the means
$\mu_n = \mathbb{E}(X_n)$ and the standard deviations $\sigma_n = \sigma(X_n)$ satisfy the condition

\[ \lim_{n\to+\infty} \frac{\sigma_n}{\mu_n} = 0. \]

Then the distribution of $X_n$ is concentrated in the sense that, for any $\epsilon > 0$, there
holds

\[ \lim_{n\to+\infty} \mathbb{P}\Bigl\{ 1 - \epsilon \le \frac{X_n}{\mu_n} \le 1 + \epsilon \Bigr\} = 1. \tag{14} \]

Proof. The result is a direct consequence of Chebyshev's inequality. ■
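The one-line proof can be expanded as follows; this is a sketch assuming $\mu_n > 0$ and $\sigma_n > 0$ for all large $n$ (the degenerate case $\sigma_n = 0$ is immediate), with Chebyshev's inequality applied to $Y = X_n$ and the auxiliary value $t := \epsilon\mu_n/\sigma_n$.

```latex
\begin{align*}
\mathbb{P}\Bigl\{ \Bigl| \frac{X_n}{\mu_n} - 1 \Bigr| \ge \epsilon \Bigr\}
  &= \mathbb{P}\bigl\{ |X_n - \mu_n| \ge t\,\sigma_n \bigr\},
     \qquad t := \frac{\epsilon\,\mu_n}{\sigma_n}, \\
  &\le \frac{1}{t^{2}}
   = \frac{1}{\epsilon^{2}} \Bigl( \frac{\sigma_n}{\mu_n} \Bigr)^{2}
   \;\longrightarrow\; 0 \qquad (n \to +\infty).
\end{align*}
```

Taking complements shows that the probability of the event $1 - \epsilon < X_n/\mu_n < 1 + \epsilon$ tends to 1, which implies (14).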


The concentration property (14) expresses the fact that values of $X_n$ tend to
become closer and closer (in relative terms) to the mean $\mu_n$ as $n$ increases. Another
figurative way of describing concentration, much used in random combinatorics, is to
say that "$X_n/\mu_n$ tends to 1 in probability"; in symbols:

\[ \frac{X_n}{\mu_n} \xrightarrow{\;P\;} 1. \]

When this property is satisfied, the expected value is in a strong sense a typical value;
this fact is an extension of the weak law of large numbers of probability theory.


Concentration properties of the binomial and Stirling cycle distributions. The
binomial distribution is concentrated, since the mean of the distribution is $n/2$ and
the standard deviation is $\sqrt{n/4}$, a much smaller quantity. Figure III.5 illustrates
concentration by displaying the graphs (as polygonal lines) associated to the binomial
distributions for $n = 5, \ldots, 50$. Concentration is also quite perceptible on
simulations as $n$ gets large: the table below describes the results of batches of ten (sorted)
simulations from the binomial distribution $\bigl\{ 2^{-n}\binom{n}{k} \bigr\}_{k=0}^{n}$:

    n = 100      39, 42, 43, 49, 50, 52, 54, 55, 55, 57
    n = 1000     487, 492, 494, 494, 506, 508, 512, 516, 527, 545
    n = 10000    4972, 4988, 5000, 5004, 5012, 5017, 5023, 5025, 5034, 5065
    n = 100000   49798, 49873, 49968, 49980, 49999, 50017, 50029, 50080, 50101, 50284;

the maximal deviations from the mean observed on such samples are 22% ($n = 10^2$),
9% ($n = 10^3$), 1.3% ($n = 10^4$), and 0.6% ($n = 10^5$). Similarly, the mean and
variance computations of (11) and (13) imply that the number of cycles in a random
permutation of large size is concentrated.
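An experiment of the same kind as the table above can be reproduced with a few lines of code. This is only a sketch under assumptions of our own: the function names, the seed, and the use of Python's `random.Random.getrandbits` to simulate $n$ fair coin flips are illustrative choices, and the numbers obtained will differ from those printed in the table.

```python
import random


def binomial_sample(n, rng):
    """Number of 1-bits among n fair random bits: one draw from the
    binomial distribution of index n."""
    return bin(rng.getrandbits(n)).count("1")


def max_relative_deviation(n, batch=10, seed=0):
    """Largest relative deviation from the mean n/2 over a batch of samples."""
    rng = random.Random(seed)          # fixed seed: reproducible batches
    samples = sorted(binomial_sample(n, rng) for _ in range(batch))
    mean = n / 2
    return max(abs(s - mean) for s in samples) / mean


for n in (100, 1000, 10000, 100000):
    print(n, max_relative_deviation(n))
```

The printed relative deviations should shrink roughly like $1/\sqrt{n}$, in line with a standard deviation of order $\sqrt{n}$ against a mean of order $n$.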


Finer estimates on distributions form the subject of our Chapter IX dedicated to
limit laws. The reader may get a feeling of some of the phenomena at stake when
examining Figure III.5 and Note III.3, p. 160: the visible emergence of a continuous
curve (the bell-shaped curve) corresponds to a common asymptotic shape for the
whole family of distributions, namely the Gaussian law.


III. 3. Inherited parameters and ordinary MGFs


In this section and the next, we address the question of determining BGFs directly 
from combinatorial specifications. The answer is provided by a simple extension of 
the symbolic method, which is formulated in terms of multivariate generating func- 
tions (MGFs). Such generating functions have the capability of taking into account a 
finite collection (equivalently, a vector) of combinatorial parameters. Bivariate gener- 
ating functions discussed earlier appear as a special case. 


III. 3.1. Multivariate generating functions (MGFs). The theory is best developed
in full generality for the joint analysis of a fixed finite collection of parameters.


Definition III.3. Consider a combinatorial class $\mathcal{A}$. A (multidimensional) parameter
$\chi = (\chi_1, \ldots, \chi_d)$ on the class is a function from $\mathcal{A}$ to the set $\mathbb{Z}_{\ge 0}^d$ of $d$-tuples of
natural numbers. The counting sequence of $\mathcal{A}$ with respect to size and the parameter $\chi$
is then defined by

\[ A_{n,k_1,\ldots,k_d} = \operatorname{card}\bigl\{ \alpha \;\big|\; |\alpha| = n,\ \chi_1(\alpha) = k_1,\ \ldots,\ \chi_d(\alpha) = k_d \bigr\}. \]

We sometimes refer to such a parameter as a "multiparameter" when $d > 1$, and
a "simple" or "scalar" parameter otherwise. For instance, one may take the class $\mathcal{P}$
of all permutations $\sigma$, and for $\chi_j$ ($j = 1, 2, 3$) the number of cycles of length $j$ in $\sigma$.
Alternatively, we may consider the class $\mathcal{W}$ of all words $w$ over an alphabet with four
letters, $\{\alpha_1, \ldots, \alpha_4\}$, and take for $\chi_j$ ($j = 1, \ldots, 4$) the number of occurrences of the
letter $\alpha_j$ in $w$, and so on.

The multi-index convention employed in various branches of mathematics greatly
simplifies notations: let $\mathbf{x} = (x_1, \ldots, x_d)$ be a vector of $d$ formal variables and
$\mathbf{k} = (k_1, \ldots, k_d)$ be a vector of integers of the same dimension; then, the multipower $\mathbf{x}^{\mathbf{k}}$ is
defined as the monomial

\[ \mathbf{x}^{\mathbf{k}} := x_1^{k_1} x_2^{k_2} \cdots x_d^{k_d}. \tag{15} \]

With this notation, we have:

Definition III.4. Let $A_{n,\mathbf{k}}$ be a multi-index sequence of numbers, where $\mathbf{k} \in \mathbb{N}^d$.
The multivariate generating function (MGF) of the sequence of either ordinary or
exponential type is defined as the formal power series

\[
\begin{array}{ll}
A(z, \mathbf{u}) = \displaystyle\sum_{n,\mathbf{k}} A_{n,\mathbf{k}}\, \mathbf{u}^{\mathbf{k}} z^n & \text{(ordinary MGF)} \\[2ex]
A(z, \mathbf{u}) = \displaystyle\sum_{n,\mathbf{k}} A_{n,\mathbf{k}}\, \mathbf{u}^{\mathbf{k}} \frac{z^n}{n!} & \text{(exponential MGF).}
\end{array}
\tag{16}
\]

Given a class $\mathcal{A}$ and a parameter $\chi$, the MGF of the pair $(\mathcal{A}, \chi)$ is the MGF of
the corresponding counting sequence. In particular, one has the combinatorial forms:

\[
\begin{array}{ll}
A(z, \mathbf{u}) = \displaystyle\sum_{\alpha \in \mathcal{A}} \mathbf{u}^{\chi(\alpha)} z^{|\alpha|} & \text{(ordinary MGF; unlabelled case)} \\[2ex]
A(z, \mathbf{u}) = \displaystyle\sum_{\alpha \in \mathcal{A}} \mathbf{u}^{\chi(\alpha)} \frac{z^{|\alpha|}}{|\alpha|!} & \text{(exponential MGF; labelled case).}
\end{array}
\tag{17}
\]

One also says that $A(z, \mathbf{u})$ is the MGF of the combinatorial class with the formal
variable $u_j$ marking the parameter $\chi_j$ and $z$ marking size.


From the very definition, with $\mathbf{1}$ a vector of all 1's, the quantity $A(z, \mathbf{1})$ coincides
with the generating function of $\mathcal{A}$, either ordinary or exponential as the case may be.
One can then view an MGF as a deformation of a univariate GF by way of a vector $\mathbf{u}$,
with the property that the multivariate GF reduces to the univariate GF at $\mathbf{u} = \mathbf{1}$. If all
but one of the $u_j$ are set to 1, then a BGF results; in this way, the symbolic calculus
that we are going to develop gives full access to BGFs (and, from here, to moments).


▷ III.5. Special cases of MGFs. The exponential MGF of permutations with $u_1, u_2$ marking
the number of 1-cycles and 2-cycles respectively is

\[ P(z, u_1, u_2) = \frac{\exp\bigl( (u_1 - 1) z + (u_2 - 1) \frac{z^2}{2} \bigr)}{1 - z}. \tag{18} \]

(This will be proved later in this chapter, p. 187.) The formula is checked to be consistent with
three already known special cases derived in Chapter II: (i) setting $u_1 = u_2 = 1$ gives back
the counting of all permutations, $P(z, 1, 1) = (1 - z)^{-1}$, as it should; (ii) setting $u_1 = 0$ and
$u_2 = 1$ gives back the EGF of derangements, namely $e^{-z}/(1 - z)$; (iii) setting $u_1 = u_2 = 0$
gives back the EGF of permutations with cycles all of length greater than 2,
$P(z, 0, 0) = e^{-z - z^2/2}/(1 - z)$, a generalized derangement GF. In addition, the particular BGF

\[ P(z, u, 1) = \frac{e^{(u-1)z}}{1 - z} \]

enumerates permutations according to singleton cycles. This last BGF interpolates between the
EGF of derangements ($u = 0$) and the EGF of all permutations ($u = 1$). ◁


III. 3.2. Inheritance and MGFs. Parameters that are inherited from substructures
(definition below) can be taken into account by a direct extension of the symbolic
method. With a suitable use of the multi-index conventions, it is even the case that the
translation rules previously established in Chapters I and II can be copied verbatim.
This approach provides a large quantity of multivariate enumeration results that follow
automatically by the symbolic method.

Definition III.5. Let $(\mathcal{A}, \chi)$, $(\mathcal{B}, \xi)$, $(\mathcal{C}, \zeta)$ be three combinatorial classes endowed
with parameters of the same dimension $d$. The parameter $\chi$ is said to be inherited in
the following cases.

• Disjoint union: when $\mathcal{A} = \mathcal{B} + \mathcal{C}$, the parameter $\chi$ is inherited from $\xi, \zeta$ iff
  its value is determined by cases from $\xi, \zeta$:

  \[ \chi(\omega) = \begin{cases} \xi(\omega) & \text{if } \omega \in \mathcal{B} \\ \zeta(\omega) & \text{if } \omega \in \mathcal{C}. \end{cases} \]

• Cartesian product: when $\mathcal{A} = \mathcal{B} \times \mathcal{C}$, the parameter $\chi$ is inherited from $\xi, \zeta$
  iff its value is obtained additively from the values of $\xi, \zeta$:

  \[ \chi(\langle \beta, \gamma \rangle) = \xi(\beta) + \zeta(\gamma). \]

• Composite constructions: when $\mathcal{A} = \mathfrak{K}\{\mathcal{B}\}$, where $\mathfrak{K}$ is a metasymbol
  representing any of SEQ, MSET, PSET, CYC, the parameter $\chi$ is inherited from $\xi$
  iff its value is obtained additively from the values of $\xi$ on components; for
  instance, for sequences:

  \[ \chi([\beta_1, \ldots, \beta_r]) = \xi(\beta_1) + \cdots + \xi(\beta_r). \]

With a natural extension of the notation used for constructions, we shall write

\[ (\mathcal{A}, \chi) = (\mathcal{B}, \xi) + (\mathcal{C}, \zeta), \qquad (\mathcal{A}, \chi) = (\mathcal{B}, \xi) \times (\mathcal{C}, \zeta), \qquad (\mathcal{A}, \chi) = \mathfrak{K}\{(\mathcal{B}, \xi)\}. \]

This definition of inheritance is seen to be a natural extension of the axioms that
size itself has to satisfy (Chapter I): size of a disjoint union is defined by cases; size
of a pair, and similarly of a composite construction, is obtained by addition.


Next, we need a bit of formality. Consider a pair $(\mathcal{A}, \chi)$, where $\mathcal{A}$ is a
combinatorial class endowed with its usual size function $|\cdot|$ and $\chi = (\chi_1, \ldots, \chi_d)$ is a
$d$-dimensional (multi)parameter. Write $\chi_0$ for size and $z_0$ for the variable marking
size (previously denoted by $z$). The key point is to define an extended multiparameter
$\overline{\chi} = (\chi_0, \chi_1, \ldots, \chi_d)$; that is, we treat size and parameters on an equal opportunity
basis. Then the ordinary MGF in (16) assumes an extremely simple and symmetrical
form:

\[ A(\mathbf{z}) = \sum_{\mathbf{k}} A_{\mathbf{k}}\, \mathbf{z}^{\mathbf{k}} = \sum_{\alpha \in \mathcal{A}} \mathbf{z}^{\overline{\chi}(\alpha)}. \tag{19} \]

Here, the indeterminates are the vector $\mathbf{z} = (z_0, z_1, \ldots, z_d)$, the indices are
$\mathbf{k} = (k_0, k_1, \ldots, k_d)$, where $k_0$ indexes size (previously denoted by $n$) and the usual
multi-index convention introduced in (15) is in force:

\[ \mathbf{z}^{\mathbf{k}} := z_0^{k_0} z_1^{k_1} \cdots z_d^{k_d}, \tag{20} \]

but it is now applied to $(d + 1)$-dimensional vectors. With this convention, we have:


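Theorem III.1 (Inherited parameters and ordinary MGFs). Let $\mathcal{A}$ be a combinatorial
class constructed from $\mathcal{B}, \mathcal{C}$, and let $\chi$ be a parameter inherited from $\xi$ defined on
$\mathcal{B}$ and (as the case may be) from $\zeta$ on $\mathcal{C}$. Then the translation rules of admissible
constructions stated in Theorem I.1, p. 27, are applicable, provided the multi-index
convention (19) is used. The associated operators on ordinary MGFs are then ($\varphi(k)$
is the Euler totient function, defined on p. 721):

\[
\begin{array}{llcl}
\text{Union:} & \mathcal{A} = \mathcal{B} + \mathcal{C} & \Longrightarrow & A(\mathbf{z}) = B(\mathbf{z}) + C(\mathbf{z}), \\[1ex]
\text{Product:} & \mathcal{A} = \mathcal{B} \times \mathcal{C} & \Longrightarrow & A(\mathbf{z}) = B(\mathbf{z}) \cdot C(\mathbf{z}), \\[1ex]
\text{Sequence:} & \mathcal{A} = \mathrm{SEQ}(\mathcal{B}) & \Longrightarrow & A(\mathbf{z}) = \dfrac{1}{1 - B(\mathbf{z})}, \\[2ex]
\text{Powerset:} & \mathcal{A} = \mathrm{PSET}(\mathcal{B}) & \Longrightarrow & A(\mathbf{z}) = \exp\Bigl( \displaystyle\sum_{\ell=1}^{\infty} \frac{(-1)^{\ell-1}}{\ell} B(\mathbf{z}^{\ell}) \Bigr), \\[2ex]
\text{Multiset:} & \mathcal{A} = \mathrm{MSET}(\mathcal{B}) & \Longrightarrow & A(\mathbf{z}) = \exp\Bigl( \displaystyle\sum_{\ell=1}^{\infty} \frac{1}{\ell} B(\mathbf{z}^{\ell}) \Bigr), \\[2ex]
\text{Cycle:} & \mathcal{A} = \mathrm{CYC}(\mathcal{B}) & \Longrightarrow & A(\mathbf{z}) = \displaystyle\sum_{\ell=1}^{\infty} \frac{\varphi(\ell)}{\ell} \log\frac{1}{1 - B(\mathbf{z}^{\ell})}.
\end{array}
\]

Proof. For disjoint unions, one has

\[ A(\mathbf{z}) = \sum_{\alpha \in \mathcal{A}} \mathbf{z}^{\chi(\alpha)} = \sum_{\beta \in \mathcal{B}} \mathbf{z}^{\xi(\beta)} + \sum_{\gamma \in \mathcal{C}} \mathbf{z}^{\zeta(\gamma)}, \]

since inheritance is defined by cases on unions. For cartesian products, one has

\[ A(\mathbf{z}) = \sum_{\alpha \in \mathcal{A}} \mathbf{z}^{\chi(\alpha)} = \sum_{\beta \in \mathcal{B}} \mathbf{z}^{\xi(\beta)} \times \sum_{\gamma \in \mathcal{C}} \mathbf{z}^{\zeta(\gamma)}, \]

since inheritance corresponds to additivity on products.

The translation of composite constructions in the case of sequences, powersets,
and multisets is then built up from the union and product schemes, in exactly the
same manner as in the proof of Theorem I.1. Cycles are dealt with by the methods of
Appendix A.4: Cycle construction, p. 729. ■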


The multi-index notation is a crucial ingredient for developing the general theory 
of multivariate enumerations. When we work with only a small number of parameters, 
typically one or two, we will however often find it convenient to return to vectors of 
variables like (z, u) or (z, u, v). In this way, unnecessary subscripts are avoided. 

The reader is especially encouraged to study the treatment of integer composi- 
tions in Examples III.5 and III.6 below carefully, since it illustrates the power of the 
multivariate symbolic method, in its bare bones version. 


Example III.5. Integer compositions and MGFs I. The class $\mathcal{C}$ of all integer compositions
(Chapter I) is specified by

\[ \mathcal{C} = \mathrm{SEQ}(\mathcal{I}), \qquad \mathcal{I} = \mathrm{SEQ}_{\ge 1}(\mathcal{Z}), \]

where $\mathcal{I}$ is the set of all positive integers. The corresponding OGFs are

\[ C(z) = \frac{1}{1 - I(z)}, \qquad I(z) = \frac{z}{1 - z}, \]

so that $C_n = 2^{n-1}$ ($n \ge 1$). Say we want to enumerate compositions according to the number $\chi$
of summands. One way to proceed, in accordance with the formal definition of inheritance, is
as follows. Let $\xi$ be the parameter that takes the constant value 1 on all elements of $\mathcal{I}$. The
parameter $\chi$ on compositions is inherited from the (almost trivial) parameter $\xi \equiv 1$ defined on
summands. The ordinary MGF of $(\mathcal{I}, \xi)$ is

\[ I(z,u) = zu + z^2 u + z^3 u + \cdots = \frac{zu}{1 - z}. \]

Let $C(z, u)$ be the BGF of $(\mathcal{C}, \chi)$. By Theorem III.1, the schemes translating admissible
constructions in the univariate case carry over to the multivariate case, so that

\[ C(z,u) = \frac{1}{1 - I(z,u)} = \frac{1}{1 - u\,\frac{z}{1-z}} = \frac{1 - z}{1 - z(u+1)}. \tag{21} \]

■
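The translation of Example III.5 can be mimicked on truncated series. The sketch below is an illustration under our own conventions, not the book's: a bivariate series is stored as a list of rows, each row being the coefficient list in $u$ of the polynomial multiplying $z^n$, and the hypothetical helper `seq_of` applies the rule $\mathrm{SEQ} \mapsto 1/(1 - B)$ through the equivalent recurrence $C_n(u) = \sum_{j=1}^{n} B_j(u)\, C_{n-j}(u)$, which follows from $C = 1 + B \cdot C$.

```python
from math import comb


def seq_of(component, n_max):
    """Rows C_n(u) of C(z,u) = 1/(1 - B(z,u)), truncated at z^n_max.

    component[j] is the coefficient list (in u) of the polynomial B_j(u)
    multiplying z^j in B(z,u); B must have no constant term in z.
    """
    rows = [[1]]                                   # C_0(u) = 1
    for n in range(1, n_max + 1):
        acc = {}
        for j in range(1, n + 1):                  # C_n = sum_j B_j * C_{n-j}
            for a, x in enumerate(component[j]):
                for b, y in enumerate(rows[n - j]):
                    acc[a + b] = acc.get(a + b, 0) + x * y
        rows.append([acc.get(k, 0) for k in range(max(acc) + 1)])
    return rows


N = 8
summand = [[0]] + [[0, 1]] * N      # I(z,u) = u z + u z^2 + u z^3 + ...
rows = seq_of(summand, N)
print(rows[5])                      # [0, 1, 4, 6, 4, 1]
```

Row 5 lists the number of compositions of 5 with $k = 0, 1, \ldots, 5$ summands; the entries agree with the coefficients of $(1-z)/(1 - z(u+1))$ in (21).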


Markers. There is an alternative way of arriving at MGFs, as in (21), which is
important and will be of much use throughout this book. A marker (or mark) in a
specification $\Sigma$ is a neutral object (i.e., an object of size 0) attached to a construction or an
atom by a product. Such a marker does not modify size, so that the univariate counting
sequence associated to $\Sigma$ remains unaffected. On the other hand, the total number of
markers that an object contains determines by design an inherited parameter, so that
Theorem III.1 is automatically applicable. In this way, one may decorate specifications
so as to keep track of "interesting" substructures and get BGFs automatically.
The insertion of several markers similarly gives MGFs.

For instance, say we are interested in the number of summands in compositions,
as in Example III.5 above. Then, one has an enriched specification, and its translation
into MGF,

\[ \mathcal{C} = \mathrm{SEQ}\bigl( \mu\, \mathrm{SEQ}_{\ge 1}(\mathcal{Z}) \bigr) \quad\Longrightarrow\quad C(z,u) = \frac{1}{1 - u\, I(z)}, \tag{22} \]

based on the correspondence: $\mathcal{Z} \mapsto z$, $\mu \mapsto u$.


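Example III.6. Integer compositions and MGFs II. Consider the double parameter $\chi = (\chi_1, \chi_2)$
where $\chi_1$ is the number of parts equal to 1 and $\chi_2$ the number of parts equal to 2.
One can write down an extended specification, with $\mu_1$ a combinatorial mark for summands
equal to 1 and $\mu_2$ for summands equal to 2,
(23)  \[ \mathcal{C} = \operatorname{SEQ}\bigl(\mu_1 \mathcal{Z} + \mu_2 \mathcal{Z}^2 + \operatorname{SEQ}_{\ge 3}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad C(z, u_1, u_2) = \frac{1}{1 - \bigl(u_1 z + u_2 z^2 + z^3(1-z)^{-1}\bigr)}, \]
where $u_j$ ($j = 1, 2$) records the number of marks of type $\mu_j$.
Similarly, let $\mu$ mark each summand and $\mu_1$ mark summands equal to 1. Then, one has,
(24)  \[ \mathcal{C} = \operatorname{SEQ}\bigl(\mu\mu_1 \mathcal{Z} + \mu \operatorname{SEQ}_{\ge 2}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad C(z, u_1, u) = \frac{1}{1 - \bigl(u u_1 z + u z^2(1-z)^{-1}\bigr)}, \]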


where $u$ keeps track of the total number of summands and $u_1$ records the number of summands
equal to 1.

MGFs obtained in this way via the multivariate extension of the symbolic method can then
provide explicit counts, after suitable series expansions. For instance, the number of
compositions of $n$ with $k$ parts is, by (21),
\[ [z^n u^k]\, \frac{1-z}{1 - (1+u)z} = \binom{n}{k} - \binom{n-1}{k} = \binom{n-1}{k-1}, \]
a result otherwise obtained in Chapter I by direct combinatorial reasoning (the balls-and-bars
model). The number of compositions of $n$ containing $k$ parts equal to 1 is obtained from the
special case $u_2 = 1$ in (23),
\[ [z^n u^k]\, \frac{1}{1 - uz - \frac{z^2}{1-z}} = [z^{n-k}]\, \frac{(1-z)^{k+1}}{(1 - z - z^2)^{k+1}}, \]
where the last OGF closely resembles a power of the OGF of Fibonacci numbers.

Following the discussion of Section III. 2, such MGFs also carry complete information
about moments. In particular, the cumulated value of the number of parts in all compositions
of $n$ has OGF
\[ \partial_u C(z,u)\big|_{u=1} = \frac{z(1-z)}{(1-2z)^2}, \]
since cumulated values are obtained via differentiation of a BGF. Therefore, the expected
number of parts in a random composition of $n$ is exactly (for $n \ge 1$)
\[ \frac{1}{2^{n-1}}\, [z^n]\, \frac{z(1-z)}{(1-2z)^2} = \frac{1}{2}(n+1). \]
One further differentiation will give rise to the variance. The standard deviation is found to
be $\frac{1}{2}\sqrt{n-1}$, which is of an order (much) smaller than the mean. Thus, the distribution of the
number of summands in a random composition satisfies the concentration property as $n \to \infty$.

In the same vein, the number of parts equal to a fixed number $r$ in compositions is
determined by
\[ \mathcal{C} = \operatorname{SEQ}\bigl(u\mathcal{Z}^r + \operatorname{SEQ}_{\ne r}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad C(z,u) = \left(1 - \left(\frac{z}{1-z} + (u-1)z^r\right)\right)^{-1}. \]
It is then easy to pull out the expected number of $r$-summands in a random composition of
size $n$. The differentiated form
\[ \partial_u C(z,u)\big|_{u=1} = \frac{z^r(1-z)^2}{(1-2z)^2} \]
gives, by partial fraction expansion,
\[ \partial_u C(z,u)\big|_{u=1} = \frac{2^{-r-2}}{(1-2z)^2} + \frac{2^{-r-1} - r\,2^{-r-2}}{1-2z} + q(z), \]
for a polynomial $q(z)$ that we do not need to make explicit. Extracting the $n$th coefficient of
the cumulative GF $\partial_u C(z,1)$ and dividing by $2^{n-1}$ yields the mean number of $r$-parts in a
random composition. Another differentiation gives access to the second moment. One obtains
the following proposition.


Proposition III.4 (Summands in integer compositions). The total number of summands in a
random composition of size $n$ has mean $\frac{1}{2}(n+1)$ and a distribution that is concentrated around
the mean. The number of $r$-summands in a composition of size $n$ has mean
\[ \frac{n}{2^{r+1}} + O(1), \]
and a standard deviation of order $\sqrt{n}$, which also ensures concentration of distribution.

Figure III.6. A random composition of $n = 100$ represented as a ragged landscape
(top); its associated profile $1^{20}\,2^{12}\,3^{10}\,4^{1}\,5^{1}\,7^{1}\,10^{1}$, defined as the partition obtained by
sorting the summands (bottom).

Results of a simulation illustrating the proposition are displayed in Figure III.6 to which
Note III.6 below adds further comments. □
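The counts and means obtained in Examples III.5 and III.6 can be checked by brute force for small $n$. The Python sketch below is an illustration only: the function names are ours, not the text's, and exhaustive enumeration is feasible only for small sizes since there are $2^{n-1}$ compositions. It compares the number of compositions with $k$ parts against $\binom{n-1}{k-1}$, the mean number of parts against $\frac{1}{2}(n+1)$, and the mean number of $r$-parts against the leading term $n/2^{r+1}$.

```python
from fractions import Fraction
from math import comb


def compositions(n):
    """Yield every composition of n as a tuple of positive summands."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


def parts_histogram(n):
    """Map k -> number of compositions of n with exactly k summands."""
    counts = {}
    for c in compositions(n):
        counts[len(c)] = counts.get(len(c), 0) + 1
    return counts


def mean_parts(n):
    """Exact mean number of summands over all compositions of n."""
    hist = parts_histogram(n)
    return Fraction(sum(k * v for k, v in hist.items()), sum(hist.values()))


def mean_r_parts(n, r):
    """Exact mean number of summands equal to r over all compositions of n."""
    total, count = 0, 0
    for c in compositions(n):
        total += c.count(r)
        count += 1
    return Fraction(total, count)


n = 12
hist = parts_histogram(n)
print("binomial law holds:", all(hist[k] == comb(n - 1, k - 1) for k in hist))
print("mean number of parts:", mean_parts(n), "vs (n+1)/2 =", Fraction(n + 1, 2))
for r in (1, 2, 3):
    print("r =", r, "mean:", mean_r_parts(n, r), "leading term:", Fraction(n, 2 ** (r + 1)))
```

Reading off the two partial fraction coefficients shows that the $O(1)$ term equals $(3-r)/2^{r+1}$ as soon as $n > r$; the brute-force means agree with $(n-r+3)/2^{r+1}$ in that range.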


▷ III.6. The profile of integer compositions. From the point of view of random structures,
Proposition III.4 shows that random compositions of large size tend to conform to a global
“profile”. With high probability, a composition of size $n$ should have about $n/4$ parts equal to 1,
$n/8$ parts equal to 2, and so on. Naturally, there are statistically unavoidable fluctuations, and
for any finite $n$, the regularity of this law cannot be perfect: it tends to fade away, especially with
regard to largest summands that are $\log_2(n) + O(1)$ with high probability. (In this region mean
and standard deviation both become of the same order and are $O(1)$, so that concentration no
longer holds.) However, such observations do tell us a great deal about what a typical random
composition must (probably) look like: it should conform to a “geometric profile”,
\[ 1^{n/4}\, 2^{n/8}\, 3^{n/16}\, 4^{n/32} \cdots . \]
Here are for instance the profiles of two compositions of size $n = 1024$ drawn uniformly at
random:
\[ 1^{250}\, 2^{138}\, 3^{70}\, 4^{29}\, 5^{15}\, 6^{10}\, 7^{4}\, 8^{0}\, 9^{1} \quad\text{and}\quad 1^{253}\, 2^{136}\, 3^{68}\, 4^{31}\, 5^{13}\, 6^{8}\, 7^{3}\, 8^{1}\, 9^{1}\, 10^{2}. \]
These are to be compared with the “ideal” profile
\[ 1^{256}\, 2^{128}\, 3^{64}\, 4^{32}\, 5^{16}\, 6^{8}\, 7^{4}\, 8^{2}\, 9^{1}. \]
It is a striking fact that samples of a very few elements or even just one element (this would
be ridiculous by the usual standards of statistics) are often sufficient to illustrate asymptotic
properties of large random structures. The reason is once more to be attributed to concentration
of distributions whose effect is manifest here. Profiles of a similar nature present themselves
among objects defined by the sequence construction, as we shall see throughout this book.
(Establishing such general laws is usually not difficult but it requires the full power of complex
analytic methods developed in Chapters IV-VIII.) ◁
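A single random sample already displays the geometric profile described in this note. The sketch below is a hedged illustration: it relies on the standard bijection between compositions of $n$ and subsets of the $n-1$ gaps between $n$ aligned balls (cutting each gap independently with probability $1/2$ gives a uniform composition), and the seed and function names are arbitrary choices of ours.

```python
import random
from collections import Counter


def random_composition(n, rng):
    """Uniform random composition of n: cut each of the n-1 gaps with probability 1/2."""
    parts, current = [], 1
    for _ in range(n - 1):
        if rng.random() < 0.5:
            parts.append(current)  # a cut closes the current summand
            current = 1
        else:
            current += 1           # no cut: the current summand grows by one
    parts.append(current)
    return parts


def profile(parts):
    """Map each summand value r to its multiplicity, sorted by r."""
    return dict(sorted(Counter(parts).items()))


rng = random.Random(2024)  # arbitrary seed, chosen only for reproducibility
sample = random_composition(1024, rng)
prof = profile(sample)
for r in range(1, 8):
    print(r, "observed:", prof.get(r, 0), "ideal:", 1024 // 2 ** (r + 1))
```

The observed multiplicities fluctuate around the ideal values $n/2^{r+1}$, with the agreement degrading for the largest summands, as the note explains.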


▷ III.7. Largest summands in compositions. For any $\epsilon > 0$, with probability tending to 1
as $n \to \infty$, the largest summand in a random integer composition of size $n$ is in the interval
$[(1-\epsilon)\log_2 n,\ (1+\epsilon)\log_2 n]$. (Hint: use the first and second moment methods. More precise
estimates are obtained by the methods of Example V.4, p. 308.) ◁

$\mathfrak{K}$        BGF ($A(z,u)$)        cumulative GF ($\Omega(z)$)

SEQ:     $\dfrac{1}{1 - uB(z)}$        $A(z)^2 \cdot B(z) = \dfrac{B(z)}{(1 - B(z))^2}$

PSET:    $\exp\Bigl(\sum_{k \ge 1} \dfrac{(-1)^{k-1}}{k}\, u^k B(z^k)\Bigr) = \prod_{n \ge 1} (1 + uz^n)^{B_n}$        $A(z) \cdot \sum_{k \ge 1} (-1)^{k-1} B(z^k)$

MSET:    $\exp\Bigl(\sum_{k \ge 1} \dfrac{u^k}{k}\, B(z^k)\Bigr) = \prod_{n \ge 1} (1 - uz^n)^{-B_n}$        $A(z) \cdot \sum_{k \ge 1} B(z^k)$

CYC:     $\sum_{k \ge 1} \dfrac{\varphi(k)}{k} \log \dfrac{1}{1 - u^k B(z^k)}$        $\sum_{k \ge 1} \varphi(k)\, \dfrac{B(z^k)}{1 - B(z^k)}$

Figure III.7. Ordinary GFs relative to the number of components in $\mathcal{A} = \mathfrak{K}(\mathcal{B})$.


Simplified notation for markers. It proves highly convenient to simplify notations,
much in the spirit of our current practice, where the atom $\mathcal{Z}$ is reflected by
the name of the variable $z$ in GFs. The following convention will be systematically
adopted: the same symbol (usually $u, v, u_1, u_2, \ldots$) is freely employed to designate a
combinatorial marker (of size 0) and the corresponding marking variable in MGFs.

For instance, we can write directly, for compositions,
\[ \mathcal{C} = \operatorname{SEQ}\bigl(u \operatorname{SEQ}_{\ge 1}(\mathcal{Z})\bigr), \qquad \mathcal{C} = \operatorname{SEQ}\bigl(u u_1 \mathcal{Z} + u \operatorname{SEQ}_{\ge 2}(\mathcal{Z})\bigr), \]
where $u$ marks all summands and $u_1$ marks summands equal to 1, giving rise to (22)
and (24) above. The symbolic scheme of Theorem III.1 invariably applies to enumeration
according to the number of markers.


III. 3.3. Number of components in abstract unlabelled schemas. Consider a
construction $\mathcal{A} = \mathfrak{K}(\mathcal{B})$, where the metasymbol $\mathfrak{K}$ designates any standard unlabelled
constructor among SEQ, MSET, PSET, CYC. What is sought is the BGF $A(z,u)$ of
class $\mathcal{A}$, with $u$ marking each component. The specification is then of the form
\[ \mathcal{A} = \mathfrak{K}(u\mathcal{B}), \qquad \mathfrak{K} = \operatorname{SEQ}, \operatorname{MSET}, \operatorname{PSET}, \operatorname{CYC}. \]
Theorem III.1 applies and yields immediately the BGF $A(z,u)$. In addition, differentiating
with respect to $u$ then setting $u = 1$ provides the GF of cumulated values
(hence, in a non-normalized form, the OGF of the sequence of mean values of the
number of components):
\[ \Omega(z) = \frac{\partial}{\partial u} A(z,u) \bigg|_{u=1}. \]

Figure III.8. A random partition of size $n = 100$ has an aspect rather different from
the profile of a random composition of the same size (Figure III.6).


In summary:

Proposition III.5 (Components in unlabelled schemas). Given a construction, $\mathcal{A} = \mathfrak{K}(\mathcal{B})$,
the BGF $A(z,u)$ and the cumulated GF $\Omega(z)$ associated to the number of components
are given by the table of Figure III.7.

Mean values are then recovered with the usual formula,
\[ \mathbb{E}_{\mathcal{A}_n}(\#\,\text{components}) = \frac{[z^n]\,\Omega(z)}{[z^n]\,A(z)}. \]

▷ III.8. $r$-Components in abstract unlabelled schemas. Consider unlabelled structures. The
BGF of the number of $r$-components in $\mathcal{A} = \mathfrak{K}(\mathcal{B})$ is given by
\[ A(z,u) = \bigl(1 - B(z) - (u-1)B_r z^r\bigr)^{-1}, \qquad A(z,u) = A(z) \cdot \left(\frac{1 - z^r}{1 - uz^r}\right)^{B_r}, \]
in the case of sequences ($\mathfrak{K} = \operatorname{SEQ}$) and multisets ($\mathfrak{K} = \operatorname{MSET}$), respectively. Similar formulae
hold for the other basic constructions and for cumulative GFs. ◁


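▷ III.9. Number of distinct components in a multiset. The specification and the BGF are
\[ \prod_{\beta \in \mathcal{B}} \bigl(1 + u \operatorname{SEQ}_{\ge 1}(\beta)\bigr) \quad\Longrightarrow\quad \prod_{n \ge 1} \left(1 + \frac{uz^n}{1 - z^n}\right)^{B_n}, \]
as follows from first principles. ◁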


As an illustration of Proposition III.5, we discuss the profile of random partitions 
(Figure III.8). 


Example III.7. The profile of partitions. Let $\mathcal{P} = \operatorname{MSET}(\mathcal{I})$ be the class of all integer
partitions, where $\mathcal{I} = \operatorname{SEQ}_{\ge 1}(\mathcal{Z})$ represents integers in unary notation. The BGF of $\mathcal{P}$ with $u$
marking the number $\chi$ of parts (or summands) is obtained from the specification
\[ \mathcal{P} = \operatorname{MSET}(u\mathcal{I}) \quad\Longrightarrow\quad P(z,u) = \exp\left(\sum_{k=1}^{\infty} \frac{u^k}{k}\, \frac{z^k}{1 - z^k}\right). \]
Equivalently, from first principles,
\[ \mathcal{P} \cong \prod_{n=1}^{\infty} \operatorname{SEQ}(u\mathcal{I}_n) \quad\Longrightarrow\quad P(z,u) = \prod_{n=1}^{\infty} \frac{1}{1 - uz^n}. \]
The OGF of cumulated values then results from the second form of the BGF by logarithmic
differentiation:
(25)  \[ \Omega(z) = P(z) \cdot \sum_{k=1}^{\infty} \frac{z^k}{1 - z^k}. \]
Now, the factor on the right in (25) can be expanded as
\[ \sum_{k=1}^{\infty} \frac{z^k}{1 - z^k} = \sum_{n=1}^{\infty} d(n)\, z^n, \]
with $d(n)$ the number of divisors of $n$. Thus, the mean value of $\chi$ is
(26)  \[ \mathbb{E}_n(\chi) = \frac{1}{P_n} \sum_{j=1}^{n} d(j)\, P_{n-j}. \]
The same technique applies to the number of parts equal to $r$. The form of the BGF is
\[ \mathcal{P} \cong \operatorname{SEQ}(u\mathcal{I}_r) \times \prod_{n \ne r} \operatorname{SEQ}(\mathcal{I}_n) \quad\Longrightarrow\quad P(z,u) = \frac{1 - z^r}{1 - uz^r} \cdot P(z), \]
which implies that the mean value of the number $\tilde{\chi}$ of $r$-parts satisfies
\[ \mathbb{E}_n(\tilde{\chi}) = \frac{1}{P_n}\, [z^n] \left(P(z) \cdot \frac{z^r}{1 - z^r}\right) = \frac{1}{P_n} \bigl(P_{n-r} + P_{n-2r} + P_{n-3r} + \cdots\bigr). \]
From these formulae and a decent symbolic manipulation package, the means are calculated
easily up to values of $n$ well into the range of several thousand. □

Figure III.9. The number of parts in random partitions of size $1, \ldots, 500$: exact
values of the mean and simulations (circles, one for each value of $n$).
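Formula (26) and its analogue for $r$-parts need only the partition numbers $P_n$ and the divisor function $d(n)$, so no symbolic package is strictly required. The following Python sketch (with function names of our own choosing) builds $P_n$ by the standard "add parts of size $k$" recurrence, which is an implementation choice rather than anything prescribed by the text, and evaluates both means exactly with rational arithmetic.

```python
from fractions import Fraction


def partition_numbers(N):
    """Return [P_0, ..., P_N], expanding prod_k 1/(1 - z^k) one factor at a time."""
    P = [1] + [0] * N
    for k in range(1, N + 1):
        for n in range(k, N + 1):
            P[n] += P[n - k]
    return P


def divisor_counts(N):
    """Return [d(0)=0, d(1), ..., d(N)] where d(n) is the number of divisors of n."""
    d = [0] * (N + 1)
    for k in range(1, N + 1):
        for multiple in range(k, N + 1, k):
            d[multiple] += 1
    return d


def mean_parts(n, P, d):
    """Mean number of parts of a random partition of n, by formula (26)."""
    return Fraction(sum(d[j] * P[n - j] for j in range(1, n + 1)), P[n])


def mean_r_parts(n, r, P):
    """Mean number of parts equal to r: (P_{n-r} + P_{n-2r} + ...) / P_n."""
    return Fraction(sum(P[n - m * r] for m in range(1, n // r + 1)), P[n])


P = partition_numbers(500)
d = divisor_counts(500)
for n in (10, 100, 500):
    print(n, float(mean_parts(n, P, d)))
```

Summing the mean number of $r$-parts over all $r$ must give back the mean number of parts, which provides a convenient consistency check between the two formulae.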


The comparison between Figures III.6 and III.8 shows that different combinatorial
models may well lead to rather different types of probabilistic behaviours. Figure III.9
displays the exact value of the mean number of parts in random partitions of size
$n = 1, \ldots, 500$ (as calculated from (26)) accompanied by the observed values of one
random sample for each value of $n$ in the range. The mean number of parts is known
to be asymptotic to
\[ \frac{\sqrt{n}\, \log n}{\pi \sqrt{2/3}}, \]
and the distribution, though it admits a comparatively large standard deviation $O(\sqrt{n})$,
is still concentrated, in the technical sense of the term. We shall prove some of these
assertions in Chapter VIII, p. 581.

In recent years, Vershik and his collaborators [152, 595] have shown that most
integer partitions tend to conform to a definite profile given (after normalization by $\sqrt{n}$)
by the continuous plane curve $y = \Psi(x)$ defined implicitly by
(27)  \[ y = \Psi(x) \quad\text{iff}\quad e^{-\alpha x} + e^{-\alpha y} = 1, \qquad \alpha = \frac{\pi}{\sqrt{6}}. \]
This is illustrated in Figure III.10 by two randomly drawn elements of $\mathcal{P}_{1000}$
represented together with the “most likely” limit shape. The theoretical result explains the
huge differences that are manifest on simulations between integer compositions and
integer partitions.

Figure III.10. Two partitions of $\mathcal{P}_{1000}$ drawn at random, compared to the limiting
shape $\Psi(x)$ defined by (27).

The last example of this section demonstrates the application of BGFs to estimates
regarding the root degree of a tree drawn uniformly at random among the class $\mathcal{G}_n$ of
general Catalan trees of size $n$. Tree parameters such as number of leaves and path
length that are more global in nature and need a recursive definition will be discussed
in Section III. 5 below.


Example III.8. Root degree in general Catalan trees. Consider the parameter $\chi$ equal to
the degree of the root in a tree, and take the class $\mathcal{G}$ of all plane unlabelled trees, i.e., general
Catalan trees. The specification is obtained by first defining trees ($\mathcal{G}$), then defining trees with a
mark for subtrees ($\mathcal{G}^{\circ}$) dangling from the root:
\[ \mathcal{G} = \mathcal{Z} \times \operatorname{SEQ}(\mathcal{G}) \quad\Longrightarrow\quad G(z) = \frac{z}{1 - G(z)}, \]
\[ \mathcal{G}^{\circ} = \mathcal{Z} \times \operatorname{SEQ}(u\mathcal{G}) \quad\Longrightarrow\quad G(z,u) = \frac{z}{1 - uG(z)}. \]
This set of equations reveals that the probability that the root degree equals $r$ is
\[ \mathbb{P}_n\{\chi = r\} = \frac{1}{G_n}\, [z^{n-1}]\, G(z)^r = \frac{r}{n-1} \binom{2n-3-r}{n-2} \frac{1}{G_n} \sim \frac{r}{2^{r+1}}, \]
this by Lagrange inversion and elementary asymptotics. Furthermore, the cumulative GF is
found to be
\[ \Omega(z) = \frac{zG(z)}{(1 - G(z))^2}. \]
The relation satisfied by $G$ entails a further simplification,
\[ \Omega(z) = \frac{1}{z}\, G(z)^3 = \left(\frac{1}{z} - 1\right) G(z) - 1, \]
so that the mean root degree admits a closed form,
\[ \mathbb{E}_n(\chi) = \frac{1}{G_n} \bigl(G_{n+1} - G_n\bigr) = 3\, \frac{n-1}{n+1}, \]
a quantity clearly asymptotic to 3.
A random plane tree is thus usually composed of a small number of root subtrees, at least
one of which should accordingly be fairly large. □
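The root-degree law of Example III.8 is easy to tabulate exactly. The sketch below (our own helper names; sizes $n \ge 2$ are assumed so that the root has at least one child) evaluates the Lagrange-inversion formula with rational arithmetic, checks that the probabilities sum to 1, and compares the mean with $3(n-1)/(n+1)$ and the individual probabilities with the limit $r/2^{r+1}$.

```python
from fractions import Fraction
from math import comb


def catalan_tree_count(n):
    """G_n = (1/n) * binom(2n-2, n-1): number of plane trees with n nodes."""
    return comb(2 * n - 2, n - 1) // n


def root_degree_distribution(n):
    """Exact law of the root degree over plane trees of size n >= 2."""
    G_n = catalan_tree_count(n)
    return {
        r: Fraction(r * comb(2 * n - 3 - r, n - 2), (n - 1) * G_n)
        for r in range(1, n)
    }


def mean_root_degree(n):
    """Mean root degree computed directly from the distribution."""
    return sum(r * p for r, p in root_degree_distribution(n).items())


n = 200
dist = root_degree_distribution(n)
print("total probability:", sum(dist.values()))
print("mean:", float(mean_root_degree(n)), "closed form:", 3 * (n - 1) / (n + 1))
for r in range(1, 6):
    print(r, float(dist[r]), "limit:", r / 2 ** (r + 1))
```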


III. 4. Inherited parameters and exponential MGFs


The theory of inheritance developed in the last section applies almost verbatim to
labelled objects. The only difference is that the variable marking size must carry a
factorial coefficient dictated by the needs of relabellings. Once more, with a suitable use
of multi-index conventions, the translation mechanisms developed in the univariate
case (Chapter II) remain in force, this in a way that parallels the unlabelled case.


Let us consider a pair $(\mathcal{A}, \chi)$, where $\mathcal{A}$ is a labelled combinatorial class endowed
with its size function $|\cdot|$ and $\chi = (\chi_1, \ldots, \chi_d)$ is a $d$-dimensional parameter. As
before, the parameter $\chi$ is extended into $\boldsymbol{\chi}$ by inserting size as zeroth coordinate and
a vector $\mathbf{z} = (z_0, \ldots, z_d)$ of $d+1$ indeterminates is introduced, with $z_0$ marking size
and $z_j$ marking $\chi_j$. Once the multi-index convention of (20) defining $\mathbf{z}^{\mathbf{k}}$ has been
brought into play, the exponential MGF of $(\mathcal{A}, \chi)$ (see Definition III.4, p. 164) can be
rephrased as
(28)  \[ A(\mathbf{z}) = \sum_{\mathbf{k}} A_{\mathbf{k}}\, \frac{\mathbf{z}^{\mathbf{k}}}{k_0!} = \sum_{\alpha \in \mathcal{A}} \frac{\mathbf{z}^{\boldsymbol{\chi}(\alpha)}}{|\alpha|!}. \]
This MGF is exponential in $z$ (alias $z_0$) but ordinary in the other variables; only the
factorial $k_0!$ is needed to take into account relabelling induced by labelled products.

We a priori restrict attention to parameters that do not depend on the absolute
values of labels (but may well depend on the relative order of labels): a parameter is
said to be compatible if, for any $\alpha$, it assumes the same value on any labelled object $\alpha$
and all the order-consistent relabellings of $\alpha$. A parameter is said to be inherited if it
is compatible and it is defined by cases on disjoint unions and determined additively
on labelled products; this is Definition III.5 (p. 165) with labelled products replacing
cartesian products. In particular, for a compatible parameter, inheritance signifies
additivity on components of labelled sequences, sets, and cycles. We can then
cut-and-paste (with minor adjustments) the statement of Theorem III.1, p. 165:

Theorem III.2 (Inherited parameters and exponential MGFs). Let $\mathcal{A}$ be a labelled
combinatorial class constructed from $\mathcal{B}, \mathcal{C}$, and let $\chi$ be a parameter inherited from
$\xi$ defined on $\mathcal{B}$ and (as the case may be) from $\zeta$ on $\mathcal{C}$. Then the translation rules of
admissible constructions stated in Theorem II.1, p. 103, are applicable, provided the
multi-index convention (28) is used. The associated operators on exponential MGFs
are then:

Union:     $\mathcal{A} = \mathcal{B} + \mathcal{C}$  $\Longrightarrow$  $A(\mathbf{z}) = B(\mathbf{z}) + C(\mathbf{z})$
Product:   $\mathcal{A} = \mathcal{B} \star \mathcal{C}$  $\Longrightarrow$  $A(\mathbf{z}) = B(\mathbf{z}) \cdot C(\mathbf{z})$
Sequence:  $\mathcal{A} = \operatorname{SEQ}(\mathcal{B})$  $\Longrightarrow$  $A(\mathbf{z}) = \dfrac{1}{1 - B(\mathbf{z})}$
Cycle:     $\mathcal{A} = \operatorname{CYC}(\mathcal{B})$  $\Longrightarrow$  $A(\mathbf{z}) = \log \dfrac{1}{1 - B(\mathbf{z})}$
Set:       $\mathcal{A} = \operatorname{SET}(\mathcal{B})$  $\Longrightarrow$  $A(\mathbf{z}) = \exp\bigl(B(\mathbf{z})\bigr)$.

Proof. Disjoint unions are treated in a similar manner to the unlabelled multivariate
case. Labelled products result from
\[ A(\mathbf{z}) = \sum_{\alpha \in \mathcal{A}} \frac{\mathbf{z}^{\boldsymbol{\chi}(\alpha)}}{|\alpha|!} = \sum_{\beta \in \mathcal{B},\, \gamma \in \mathcal{C}} \binom{|\beta| + |\gamma|}{|\beta|,\, |\gamma|} \frac{\mathbf{z}^{\boldsymbol{\xi}(\beta)}\, \mathbf{z}^{\boldsymbol{\zeta}(\gamma)}}{(|\beta| + |\gamma|)!}, \]
and the usual translation of binomial convolutions that reflect labellings by means of
products of exponential generating functions (like in the univariate case detailed in
Chapter II). The translation for composite constructions is then immediate. ∎


This theorem can be exploited to determine moments, in a way that entirely par- 
allels its unlabelled counterpart. 


Example III.9. The profile of permutations. Let $\mathcal{P}$ be the class of all permutations and $\chi$ the
number of components. Using the concept of marking, the specification and the exponential
BGF are
\[ \mathcal{P} = \operatorname{SET}\bigl(u \operatorname{CYC}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad P(z,u) = \exp\left(u \log \frac{1}{1-z}\right) = (1-z)^{-u}, \]
as was already obtained by an ad hoc calculation in (5). We also know (p. 160) that the mean
number of cycles is the harmonic number $H_n$ and that the distribution is concentrated, since the
standard deviation is much smaller than the mean.

Regarding the number $\tilde{\chi}$ of cycles of length $r$, the specification and the exponential BGF
are now
(29)  \[ \mathcal{P} = \operatorname{SET}\bigl(\operatorname{CYC}_{\ne r}(\mathcal{Z}) + u \operatorname{CYC}_{=r}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad P(z,u) = \exp\left(\log \frac{1}{1-z} + (u-1)\frac{z^r}{r}\right) = \frac{e^{(u-1)z^r/r}}{1-z}. \]
The EGF of cumulated values is then
(30)  \[ \Omega(z) = \frac{z^r}{r}\, \frac{1}{1-z}. \]
The result is a remarkably simple one: In a random permutation of size $n$, the mean number
of $r$-cycles is equal to $1/r$ for any $r \le n$.

Thus, the profile of a random permutation, where profile is defined as the ordered sequence
of cycle lengths, departs significantly from what has been encountered for integer compositions
and partitions. Formula (30) also sheds a new light on the harmonic number formula for the
mean number of cycles: each term $1/r$ in the harmonic number expresses the mean number
of $r$-cycles.

As formulae are so simple, one can extract more information. By (29) one has
\[ \mathbb{P}\{\tilde{\chi} = k\} = \frac{1}{k!\, r^k}\, [z^{n-kr}]\, \frac{e^{-z^r/r}}{1-z}, \]
where the last factor counts permutations without cycles of length $r$. From this (and the
asymptotics of generalized derangement numbers in Note IV.9, p. 261), one proves easily that the
asymptotic law of the number of $r$-cycles is Poisson¹ of rate $1/r$; in particular it is not
concentrated. (This interesting property to be established in later chapters constitutes the starting point
of an important study by Shepp and Lloyd [540].)

Furthermore, the mean number of cycles whose size is between $n/2$ and $n$ is $H_n - H_{\lfloor n/2 \rfloor}$,
a quantity that equals the probability of existence of such a long cycle and is approximately
$\log 2 \doteq 0.69314$. In other words, we expect a random permutation of size $n$ to have one or a
few large cycles. (See the article of Shepp and Lloyd [540] for the original discussion of largest
and smallest cycles.) □

Figure III.11. The profile of permutations: a rendering of the cycle structure of six
random permutations of size 500, where circle areas are drawn in proportion to cycle
lengths. Permutations tend to have a few small cycles (of size $O(1)$), a few large ones
(of size $\Theta(n)$), and altogether have $H_n \sim \log n$ cycles on average.
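The statement that the mean number of $r$-cycles equals $1/r$ can be confirmed exhaustively for small sizes. The following Python sketch is only a brute-force check over all $n!$ permutations (feasible for $n$ up to about 8); the helper names are ours.

```python
from fractions import Fraction
from itertools import permutations


def cycle_lengths(perm):
    """Return the list of cycle lengths of a permutation given as a tuple of images of 0..n-1."""
    seen = [False] * len(perm)
    lengths = []
    for start in range(len(perm)):
        if seen[start]:
            continue
        length, j = 0, start
        while not seen[j]:          # follow the cycle through `start`
            seen[j] = True
            j = perm[j]
            length += 1
        lengths.append(length)
    return lengths


def mean_r_cycles(n, r):
    """Exact mean number of cycles of length r over all permutations of size n."""
    total, count = 0, 0
    for perm in permutations(range(n)):
        total += cycle_lengths(perm).count(r)
        count += 1
    return Fraction(total, count)


n = 6
for r in range(1, n + 1):
    print(r, mean_r_cycles(n, r))
print("mean number of cycles:", sum(mean_r_cycles(n, r) for r in range(1, n + 1)))
```

Summing the means over $r = 1, \ldots, n$ recovers the harmonic number $H_n$, in line with the remark following (30).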


▷ III.10. A hundred prisoners II. This is the solution to the prisoners problem of Note II.15,
p. 124. The better strategy goes as follows. Each prisoner will first open the drawer which
corresponds to his number. If his number is not there, he’ll use the number he just found to
access another drawer, then find a number there that points him to a third drawer, and so on,
hoping to return to his original drawer in at most 50 trials. (The last opened drawer will then
contain his number.) This strategy globally succeeds provided the initial permutation $\sigma$ defined
by $\sigma_i$ (the number contained in drawer $i$) has all its cycles of length at most 50. The probability
of the event is
\[ p = [z^{100}] \exp\left(\frac{z}{1} + \frac{z^2}{2} + \cdots + \frac{z^{50}}{50}\right) = 1 - \sum_{j=51}^{100} \frac{1}{j} \doteq 0.3118278206. \]
Do the prisoners stand a chance against a malicious director who would not place the numbers
in drawers at random? For instance, the director might organize the numbers in a cyclic
permutation. [Hint: randomize the problem by renumbering the drawers according to a randomly
chosen permutation.] ◁

¹ The Poisson distribution of rate $\lambda > 0$ has the non-negative integers as support and is determined by
\[ \mathbb{P}\{k\} = e^{-\lambda}\, \frac{\lambda^k}{k!}. \]

Figure III.12. Two random allocations with $m = 12$, $n = 48$, corresponding to
$\lambda = n/m = 4$ (left). The right-most diagrams display the bins sorted by decreasing
order of occupancy.
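Both expressions for the success probability $p$ can be evaluated exactly. In the sketch below (function name and approach are ours), the coefficient of the truncated exponential is obtained from the recurrence $k\,a_k = \sum_{j=1}^{\min(k,50)} a_{k-j}$, which follows from differentiating $A(z) = \exp(\sum_{j \le 50} z^j/j)$ to get $A'(z) = A(z)\sum_{j \le 50} z^{j-1}$.

```python
from fractions import Fraction


def prob_all_cycles_at_most(n, limit):
    """[z^n] exp(sum_{j<=limit} z^j / j): probability that a random permutation
    of size n has no cycle longer than `limit`."""
    a = [Fraction(1)]
    for k in range(1, n + 1):
        # k * a_k = a_{k-1} + a_{k-2} + ... + a_{k-min(k, limit)}
        a.append(sum(a[k - j] for j in range(1, min(k, limit) + 1)) / k)
    return a[n]


p_series = prob_all_cycles_at_most(100, 50)
p_closed = 1 - sum(Fraction(1, j) for j in range(51, 101))
print(p_series == p_closed)
print(float(p_series))
```

The closed form holds because a permutation of size 100 can contain at most one cycle longer than 50, so the probability of having such a cycle equals the expected number of them, $\sum_{j=51}^{100} 1/j$.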


Example III.10. Allocations, balls-in-bins models, and the Poisson law. Random allocations
and the balls-in-bins model were introduced in Chapter II in connection with the birthday
paradox and the coupon collector problem. Under this model, there are $n$ balls thrown into $m$ bins
in all possible ways, the total number of allocations being thus $m^n$. By the labelled construction
of words, the bivariate EGF with $z$ marking the number of balls and $u$ marking the number $\chi^{(s)}$
of bins that contain $s$ balls ($s$ a fixed parameter) is given by
\[ \mathcal{A} = \operatorname{SEQ}_m\bigl(\operatorname{SET}_{\ne s}(\mathcal{Z}) + u \operatorname{SET}_{=s}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad A^{(s)}(z,u) = \left(e^z + (u-1)\frac{z^s}{s!}\right)^m. \]
In particular, the distribution of the number of empty bins ($\chi^{(0)}$) is expressible in terms of
Stirling partition numbers:
\[ \mathbb{P}_{m,n}(\chi^{(0)} = k) \equiv \frac{n!}{m^n}\, [u^k z^n]\, A^{(0)}(z,u) = \frac{(m-k)!}{m^n} \binom{m}{k} \left\{ {n \atop m-k} \right\}. \]
By differentiating the BGF, we get an exact expression for the mean (any $s \ge 0$):
(31)  \[ \frac{1}{m}\, \mathbb{E}_{m,n}(\chi^{(s)}) = \frac{1}{s!} \left(1 - \frac{1}{m}\right)^{n-s} \frac{n(n-1)\cdots(n-s+1)}{m^s}. \]
Let $m$ and $n$ tend to infinity in such a way that $n/m = \lambda$ is a fixed constant. This regime
is extremely important in many applications, some of which are listed below. The average
proportion of bins containing $s$ elements is $\frac{1}{m}\mathbb{E}_{m,n}(\chi^{(s)})$, and from (31), one obtains by
straightforward calculations the asymptotic limit estimate,
(32)  \[ \lim_{n/m = \lambda,\ n \to \infty} \frac{1}{m}\, \mathbb{E}_{m,n}(\chi^{(s)}) = e^{-\lambda}\, \frac{\lambda^s}{s!}. \]
(See Figure III.12 for two simulations corresponding to $\lambda = 4$.) In other words, a Poisson
formula describes the average proportion of bins of a given size in a large random allocation.
(Equivalently, the occupancy of a random bin in a random allocation satisfies a Poisson law in
the limit.)

The variance of each $\chi^{(s)}$ (with fixed $s$) is estimated similarly via a second derivative and
one finds:
\[ \mathbb{V}_{m,n}(\chi^{(s)}) \sim m\, e^{-2\lambda}\, \frac{\lambda^s}{s!}\, E(\lambda), \qquad E(\lambda) := \left(e^{\lambda} - \frac{s\lambda^{s-1}}{(s-1)!} - (1-2s)\frac{\lambda^s}{s!} - \frac{\lambda^{s+1}}{s!}\right). \]
As a consequence, one has the convergence in probability,
\[ \frac{1}{m}\, \chi^{(s)} \xrightarrow{\ P\ } e^{-\lambda}\, \frac{\lambda^s}{s!}, \]
valid for any fixed $s \ge 0$. See Example VIII.14, p. 598 for an analysis of the most filled urn. □

$\mathfrak{K}$        exponential BGF ($A(z,u)$)        cumulative GF ($\Omega(z)$)

SEQ:     $\dfrac{1}{1 - uB(z)}$        $A(z)^2 \cdot B(z) = \dfrac{B(z)}{(1 - B(z))^2}$

SET:     $\exp\bigl(uB(z)\bigr)$        $A(z) \cdot B(z) = B(z)\, e^{B(z)}$

CYC:     $\log \dfrac{1}{1 - uB(z)}$        $\dfrac{B(z)}{1 - B(z)}$

Figure III.13. Exponential GFs relative to the number of components in $\mathcal{A} = \mathfrak{K}(\mathcal{B})$.
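The exact mean (31) and its Poisson limit (32) can be compared numerically. The sketch below is illustrative only: the function names are ours, the brute-force routine enumerates all $m^n$ allocations and is therefore usable only for tiny parameters, and the closed form is evaluated under the assumption $m \ge 2$.

```python
from fractions import Fraction
from itertools import product
from math import exp, factorial


def mean_proportion_exact(m, n, s):
    """Right-hand side of (31): mean proportion of bins holding exactly s balls (m >= 2)."""
    falling = 1
    for i in range(s):
        falling *= n - i                      # n (n-1) ... (n-s+1)
    return Fraction(1, factorial(s)) * Fraction(m - 1, m) ** (n - s) * Fraction(falling, m ** s)


def mean_proportion_bruteforce(m, n, s):
    """Same quantity by enumerating all m^n allocations of n labelled balls."""
    total = 0
    for alloc in product(range(m), repeat=n):
        total += sum(1 for b in range(m) if alloc.count(b) == s)
    return Fraction(total, m * m ** n)


def poisson_probability(lam, s):
    """e^{-lam} lam^s / s!, the limit in (32)."""
    return exp(-lam) * lam ** s / factorial(s)


lam = 4
for m in (12, 100, 1000):
    row = [round(float(mean_proportion_exact(m, lam * m, s)), 5) for s in range(6)]
    print(m, row)
print("Poisson", [round(poisson_probability(lam, s), 5) for s in range(6)])
```

Already for $m = 12$ and $n = 48$, the parameters of Figure III.12, the exact proportions are close to the Poisson values.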


▷ III.11. Hashing and random allocations. Random allocations of balls into bins are central
in the understanding of a class of important algorithms of computer science known as
hashing [378, 537, 538, 598]: given a universe $\mathcal{U}$ of data, set up a function (called a hashing
function) $h : \mathcal{U} \to [1\,..\,m]$ and arrange for an array of $m$ bins; an element $x \in \mathcal{U}$ is placed in bin
number $h(x)$. If the hash function scrambles the data in a way that is suitably (pseudo)uniform,
then the process of hashing a file of $n$ records (keys, data items) into $m$ bins is adequately
modelled by a random allocation scheme. If $\lambda = n/m$, representing the “load”, is kept reasonably
bounded (say, $\lambda \le 10$), the previous analysis implies that hashing allows for an almost direct
access to data. (See also Example II.19, p. 146 for a strategy that folds colliding items into a
table.) ◁


Number of components in abstract labelled schemas. As in the unlabelled uni- 
verse, a general formula gives the distribution of the number of components for the 
basic constructions. 


Proposition III.6. Consider labelled structures and the parameter $\chi$ equal to the
number of components in a construction $\mathcal{A} = \mathfrak{K}(\mathcal{B})$, where $\mathfrak{K}$ is one of SEQ, SET, CYC.
The exponential BGF $A(z,u)$ and the exponential GF $\Omega(z)$ of cumulated values are
given by the table of Figure III.13.

Mean values are then easily recovered, and one finds
\[ \mathbb{E}_{\mathcal{A}_n}(\chi) = \frac{\Omega_n}{A_n} = \frac{[z^n]\,\Omega(z)}{[z^n]\,A(z)}, \]
by the same formula as in the unlabelled case.

▷ III.12. $r$-Components in abstract labelled schemas. The BGF $A(z,u)$ and the cumulative
EGF $\Omega(z)$ are given by the following table,

SEQ:     $\dfrac{1}{1 - \bigl(B(z) + (u-1)\frac{B_r z^r}{r!}\bigr)}$        $\dfrac{1}{(1 - B(z))^2} \cdot \dfrac{B_r z^r}{r!}$

SET:     $\exp\Bigl(B(z) + (u-1)\dfrac{B_r z^r}{r!}\Bigr)$        $e^{B(z)} \cdot \dfrac{B_r z^r}{r!}$

CYC:     $\log \dfrac{1}{1 - \bigl(B(z) + (u-1)\frac{B_r z^r}{r!}\bigr)}$        $\dfrac{1}{1 - B(z)} \cdot \dfrac{B_r z^r}{r!}$

in the labelled case. ◁


Example III.11. Set partitions. Set partitions $\mathcal{S}$ are sets of blocks, themselves non-empty sets
of elements. The enumeration of set partitions according to the number of blocks is then given
by
\[ \mathcal{S} = \operatorname{SET}\bigl(u \operatorname{SET}_{\ge 1}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad S(z,u) = e^{u(e^z - 1)}. \]
Since set partitions are otherwise known to be enumerated by the Stirling partition numbers,
one has the BGF and the vertical EGFs as a corollary,
\[ \sum_{n,k} \left\{ {n \atop k} \right\} u^k \frac{z^n}{n!} = e^{u(e^z - 1)}, \qquad \sum_{n} \left\{ {n \atop k} \right\} \frac{z^n}{n!} = \frac{(e^z - 1)^k}{k!}, \]
which is consistent with earlier calculations of Chapter II.
The EGF of cumulated values, $\Omega(z)$, is then almost a derivative of $S(z)$:
\[ \Omega(z) = (e^z - 1)\, e^{e^z - 1} = \frac{d}{dz} S(z) - S(z). \]
Thus, the mean number of blocks in a random partition of size $n$ equals
\[ \frac{\Omega_n}{S_n} = \frac{S_{n+1}}{S_n} - 1, \]
a quantity directly expressible in terms of Bell numbers. A delicate computation based on
the asymptotic expansion of the Bell numbers reveals that the expected value and the standard
deviation are asymptotic to
\[ \frac{n}{\log n}, \qquad \frac{\sqrt{n}}{\log n}, \]
respectively (Chapter VIII, p. 595). Similarly the exponential BGF of the number of blocks of
size $k$ is
\[ \mathcal{S} = \operatorname{SET}\bigl(u \operatorname{SET}_{=k}(\mathcal{Z}) + \operatorname{SET}_{\ne 0,k}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad S(z,u) = e^{e^z - 1 + (u-1)z^k/k!}, \]
out of which mean and variance can also be derived. □
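The identity $\Omega_n / S_n = S_{n+1}/S_n - 1$ can be checked against a direct computation from Stirling partition numbers. The sketch below uses the classical recurrence for these numbers, an implementation choice of ours rather than something taken from the text, and works entirely with exact rationals.

```python
from fractions import Fraction


def stirling2_table(N):
    """S[n][k] = number of partitions of an n-set into k blocks, for 0 <= k <= n <= N."""
    S = [[0] * (N + 1) for _ in range(N + 1)]
    S[0][0] = 1
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            # element n joins one of k existing blocks, or forms a new block
            S[n][k] = k * S[n - 1][k] + S[n - 1][k - 1]
    return S


def mean_blocks_direct(n, S):
    """Mean number of blocks, averaging k against the Stirling numbers."""
    return Fraction(sum(k * S[n][k] for k in range(n + 1)), sum(S[n]))


def mean_blocks_bell(n, bell):
    """Mean number of blocks via Bell numbers: S_{n+1}/S_n - 1."""
    return Fraction(bell[n + 1], bell[n]) - 1


S = stirling2_table(41)
bell = [sum(row) for row in S]
for n in (5, 10, 20, 40):
    print(n, float(mean_blocks_bell(n, bell)), mean_blocks_direct(n, S) == mean_blocks_bell(n, bell))
```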


Example III.12. Root degree in Cayley trees. Consider the class $\mathcal{T}$ of Cayley trees (non-plane
labelled trees) and the parameter “root-degree”. The basic specifications are
\[ \mathcal{T} = \mathcal{Z} \star \operatorname{SET}(\mathcal{T}) \quad\Longrightarrow\quad T(z) = z\, e^{T(z)}, \]
\[ \mathcal{T}^{\circ} = \mathcal{Z} \star \operatorname{SET}(u\mathcal{T}) \quad\Longrightarrow\quad T(z,u) = z\, e^{uT(z)}. \]
The set construction reflects the non-planar character of Cayley trees and the specification $\mathcal{T}^{\circ}$ is
enriched by a mark associated to subtrees dangling from the root. Lagrange inversion provides
the fraction of trees with root degree $k$,
\[ \frac{1}{(k-1)!}\, \frac{n!}{(n-1-k)!}\, \frac{(n-1)^{n-2-k}}{n^{n-1}} \sim \frac{e^{-1}}{(k-1)!}, \qquad k \ge 1. \]
Similarly, the cumulative GF is found to be $\Omega(z) = T(z)^2$, so that the mean root degree satisfies
\[ \mathbb{E}_{\mathcal{T}_n}(\text{root degree}) = 2\left(1 - \frac{1}{n}\right) \sim 2. \]
Thus the law of root degree is asymptotically a Poisson law of rate 1, shifted by 1. Probabilistic
phenomena qualitatively similar to those encountered in plane trees are observed here, since
the mean root degree is asymptotic to a constant. However a Poisson law eventually reflecting
the non-planarity condition replaces the modified geometric law (known as a negative binomial
law) present in plane trees. □
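As with plane trees, the root-degree law of Cayley trees can be tabulated exactly from the Lagrange-inversion formula. The following sketch (helper names are ours; $n \ge 2$ is assumed) verifies that the fractions sum to 1, that the mean is $2(1 - 1/n)$, and that the law approaches the shifted Poisson distribution $e^{-1}/(k-1)!$.

```python
from fractions import Fraction
from math import exp, factorial


def root_degree_distribution(n):
    """Fraction of Cayley trees of size n >= 2 whose root has degree k, for k = 1..n-1."""
    dist = {}
    for k in range(1, n):
        count = (Fraction(factorial(n), factorial(k - 1) * factorial(n - 1 - k))
                 * Fraction(n - 1) ** (n - 2 - k))   # exponent is -1 when k = n-1
        dist[k] = count / n ** (n - 1)               # divide by T_n = n^(n-1)
    return dist


def mean_root_degree(n):
    """Mean root degree computed directly from the distribution."""
    return sum(k * p for k, p in root_degree_distribution(n).items())


def shifted_poisson(k):
    """Limit law: Poisson of rate 1 shifted by 1, i.e. e^{-1} / (k-1)!."""
    return exp(-1) / factorial(k - 1)


n = 300
dist = root_degree_distribution(n)
print("mean:", float(mean_root_degree(n)), "closed form:", 2 * (1 - 1 / n))
for k in range(1, 6):
    print(k, float(dist[k]), shifted_poisson(k))
```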


▷ III.13. Numbers of components in alignments. Alignments ($\mathcal{O}$) are sequences of cycles
(Chapter II, p. 119). The expected number of components in a random alignment of $\mathcal{O}_n$ is
\[ \frac{[z^n]\, \log(1-z)^{-1}\, \bigl(1 - \log(1-z)^{-1}\bigr)^{-2}}{[z^n]\, \bigl(1 - \log(1-z)^{-1}\bigr)^{-1}}. \]
Methods of Chapter V imply that the number of components in a random alignment has
expectation $\sim n/(e-1)$ and standard deviation $\Theta(\sqrt{n})$. ◁


▷ III.14. Image cardinality of a random surjection. The expected cardinality of the image of a
random surjection in $\mathcal{R}_n$ (Chapter II, p. 106) is
\[ \frac{[z^n]\, (e^z - 1)\, (2 - e^z)^{-2}}{[z^n]\, (2 - e^z)^{-1}}. \]
The number of values whose preimages have cardinality $k$ is obtained upon replacing the factor
$e^z - 1$ in the numerator by $z^k/k!$. By the methods of Chapters IV (p. 259) and V (p. 296), the image cardinality of a
random surjection has expectation $\sim n/(2 \log 2)$ and standard deviation $\Theta(\sqrt{n})$. ◁


▷ III.15. Distinct component sizes in set partitions. Take the number of distinct block sizes
and cycle sizes in set partitions and permutations. The bivariate EGFs are
\[ \prod_{n=1}^{\infty} \bigl(1 - u + u\, e^{z^n/n!}\bigr), \qquad \prod_{n=1}^{\infty} \bigl(1 - u + u\, e^{z^n/n}\bigr), \]
as follows from first principles. ◁


Postscript: Towards a theory of schemas. Let us look back and recapitulate
some of the information gathered in pages 167-180 regarding the number of components
in composite structures. The classes considered in Figure III.14 are compositions
of two constructions, either in the unlabelled or the labelled universe. Each entry
contains the BGF for the number of components (e.g., cycles in permutations, parts
in integer partitions, and so on), and the asymptotic orders of the mean and standard
deviation of the number of components for objects of size $n$.

Some obvious facts stand out from the data and call for explanation. First the
outer construction appears to play the essential rôle: outer sequence constructs (compare
integer compositions, surjections and alignments) tend to dictate a number of
components that is $\Theta(n)$ on average, while outer set constructs (compare integer
partitions, set partitions, and permutations) are associated with a greater variety of
asymptotic regimes. Eventually, such facts can be organized into broad analytic schemas, as
will be seen in Chapters V-IX.

Unlabelled structures

Integer partitions, MSET ∘ SEQ
    BGF: $\exp\Bigl(u\dfrac{z}{1-z} + \dfrac{u^2}{2}\dfrac{z^2}{1-z^2} + \cdots\Bigr)$
    mean $\sim \dfrac{\sqrt{n}\,\log n}{\pi\sqrt{2/3}}$, standard deviation $\Theta(\sqrt{n})$

Integer compositions, SEQ ∘ SEQ
    BGF: $\Bigl(1 - u\dfrac{z}{1-z}\Bigr)^{-1}$
    mean $\sim \dfrac{n}{2}$, standard deviation $\Theta(\sqrt{n})$

Labelled structures

Set partitions, SET ∘ SET
    BGF: $\exp\bigl(u(e^z - 1)\bigr)$
    mean $\sim \dfrac{n}{\log n}$, standard deviation $\sim \dfrac{\sqrt{n}}{\log n}$

Surjections, SEQ ∘ SET
    BGF: $\bigl(1 - u(e^z - 1)\bigr)^{-1}$
    mean $\sim \dfrac{n}{2 \log 2}$, standard deviation $\Theta(\sqrt{n})$

Permutations, SET ∘ CYC
    BGF: $\exp\bigl(u \log(1-z)^{-1}\bigr)$
    mean $\sim \log n$, standard deviation $\sim \sqrt{\log n}$

Alignments, SEQ ∘ CYC
    BGF: $\bigl(1 - u \log(1-z)^{-1}\bigr)^{-1}$
    mean $\sim \dfrac{n}{e-1}$, standard deviation $\Theta(\sqrt{n})$

Figure III.14. Major properties of the number of components in six level-two structures.
For each class, from top to bottom: (i) specification type; (ii) BGF; (iii) mean
and standard deviation of the number of components.

▷ III.16. Singularity and probability. The differences in behaviour are to be assigned to the
rather different types of singularity involved (Chapters IV-VIII): on the one hand sets
corresponding algebraically to an $\exp(\cdot)$ operator induce an exponential blow-up of singularities; on
the other hand sequences expressed algebraically by quasi-inverses $(1 - \cdot)^{-1}$ are likely to
induce polar singularities. Recursive structures such as trees lead to yet other types of phenomena
with a number of components, e.g., the root degree, that is bounded in probability. ◁


III. 5. Recursive parameters 


In this section, we adapt the general methodology of previous sections in order to 
treat parameters that are defined by recursive rules over structures that are themselves 
recursively specified. Typical applications concern trees and tree-like structures. 

Regarding the number of leaves, or more generally, the number of nodes of some
fixed degree, in a tree, the method of placing marks applies, as in the non-recursive
case. It suffices to distinguish elements of interest and mark them by an auxiliary
variable. For instance, in order to mark composite objects made of $r$ components,
where $r$ is an integer and $\mathfrak{K}$ designates any of SEQ, SET (or MSET, PSET), CYC, one
should split a construction $\mathfrak{K}(\mathcal{C})$ as follows:
\[ \mathfrak{K}(\mathcal{C}) = u\,\mathfrak{K}_{=r}(\mathcal{C}) + \mathfrak{K}_{\ne r}(\mathcal{C}) = (u-1)\,\mathfrak{K}_{=r}(\mathcal{C}) + \mathfrak{K}(\mathcal{C}). \]
This technique gives rise to specifications decorated by marks to which Theorems III.1
and III.2 apply. For a recursively-defined structure, the outcome is a functional equation
defining the BGF recursively. The situation is illustrated by Examples III.13
and III.14 below in the case of Catalan trees and the parameter number of leaves.


Example W113. Leaves in general Catalan trees. How many leaves does a random tree of 
some variety have? Can different varieties of trees be somehow distinguished by the proportion 
of their leaves? Beyond the botany of combinatorics, such considerations are for instance rele- 
vant to the analysis of algorithms since tree leaves, having no descendants, can be stored more 
economically; see [377, Sec. 2.3] for an algorithmic motivation for such questions. 

Consider once more the class $\mathcal{G}$ of plane unlabelled trees, $\mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G})$, enumerated
by the Catalan numbers: $G_n = \frac{1}{n}\binom{2n-2}{n-1}$. The class $\mathcal{G}^{\circ}$ where each leaf is marked is

    $\mathcal{G}^{\circ} = \mathcal{Z}u + \mathcal{Z} \times \mathrm{SEQ}_{\ge 1}(\mathcal{G}^{\circ}) \quad\Longrightarrow\quad G(z,u) = zu + \frac{z\,G(z,u)}{1 - G(z,u)}.$

The induced quadratic equation can be solved explicitly

    $G(z,u) = \frac{1}{2}\left(1 + (u-1)z - \sqrt{1 - 2(u+1)z + (u-1)^2 z^2}\right).$

It is however simpler to expand using the Lagrange inversion theorem which yields

    $G_{n,k} = [u^k]\left([z^n]G(z,u)\right) = [u^k]\left(\frac{1}{n}[y^{n-1}]\left(u + \frac{y}{1-y}\right)^{n}\right)$
    $\phantom{G_{n,k}} = \frac{1}{n}\binom{n}{k}[y^{n-1}]\frac{y^{n-k}}{(1-y)^{n-k}} = \frac{1}{n}\binom{n}{k}\binom{n-2}{k-1}.$

These numbers are known as Narayana numbers, see EIS A001263, and they surface repeatedly
in connection with ballot problems. The mean number of leaves is derived from the cumulative
GF, which is

    $\Omega(z) = \partial_u G(z,u)\big|_{u=1} = \frac{1}{2}z + \frac{1}{2}\frac{z}{\sqrt{1-4z}},$

so that the mean is n/2 exactly for n ≥ 2. The distribution is concentrated since the standard
deviation is easily calculated to be $O(\sqrt{n})$. □
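The closed form $\frac{1}{n}\binom{n}{k}\binom{n-2}{k-1}$ and the exact mean n/2 can be checked numerically. The following is only a sketch: the function names are hypothetical, and the recursion simply unfolds the specification "a tree is a root followed by a sequence of trees", counting nodes and leaves.

```python
from math import comb

def narayana_leaves(n, k):
    """Closed form (1/n) C(n,k) C(n-2,k-1): plane trees with n nodes and k leaves."""
    if n == 1:
        return 1 if k == 1 else 0
    if k < 1 or k > n - 1:
        return 0
    return comb(n, k) * comb(n - 2, k - 1) // n

def leaf_counts(max_n):
    """T[n][k] = number of plane trees with n nodes and k leaves, by direct recursion."""
    T = [[0] * (max_n + 1) for _ in range(max_n + 1)]
    # F[m][j] = number of sequences of trees with m nodes and j leaves in total
    F = [[0] * (max_n + 1) for _ in range(max_n + 1)]
    F[0][0] = 1
    T[1][1] = 1                      # the single-node tree is a leaf
    for m in range(1, max_n + 1):
        if m >= 2:
            T[m] = F[m - 1][:]       # a root placed on top of a non-empty forest
        for s in range(1, m + 1):    # size of the first tree of the forest
            for l in range(1, s + 1):
                if T[s][l]:
                    for j in range(l, m + 1):
                        F[m][j] += T[s][l] * F[m - s][j - l]
    return T

if __name__ == "__main__":
    T = leaf_counts(8)
    for n in range(2, 9):
        total = sum(T[n])
        mean_times_total = sum(k * T[n][k] for k in range(n + 1))
        print(n, total, mean_times_total / total)   # the mean should equal n/2
```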


Example III.14. Leaves and node types in binary trees. The class $\mathcal{B}$ of binary plane trees, also
enumerated by Catalan numbers ($B_n = \frac{1}{n+1}\binom{2n}{n}$) can be specified as

    (33)    $\mathcal{B} = \mathcal{Z} + (\mathcal{B} \times \mathcal{Z}) + (\mathcal{Z} \times \mathcal{B}) + (\mathcal{B} \times \mathcal{Z} \times \mathcal{B}),$

which stresses the distinction between four types of nodes: leaves, left branching, right branching,
and binary. Let $u_0, u_1, u_2$ be variables that mark nodes of degree 0, 1, 2, respectively. Then
the root decomposition (33) yields, for the MGF $B = B(z, u_0, u_1, u_2)$, the functional equation

    $B = z u_0 + 2 z u_1 B + z u_2 B^2,$

which, by Lagrange inversion, gives

    $B_{n,k_0,k_1,k_2} = \frac{2^{k_1}}{n}\binom{n}{k_0, k_1, k_2},$

subject to the natural conditions: $k_0 + k_1 + k_2 = n$ and $k_0 = k_2 + 1$. Moments can be easily
calculated using this approach [499]. In particular, the mean number of nodes of each type is
asymptotically:

    leaves: $\sim \frac{n}{4}$,    1-nodes: $\sim \frac{n}{2}$,    2-nodes: $\sim \frac{n}{4}$.

There is an equal asymptotic proportion of leaves, double nodes, left branching, and right
branching nodes. Furthermore, the standard deviation is in each case $O(\sqrt{n})$, so that all the
corresponding distributions are concentrated. □
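As a quick numerical illustration (a sketch with hypothetical function names, not part of the original derivation), one can sum the Lagrange formula $\frac{2^{k_1}}{n}\binom{n}{k_0,k_1,k_2}$ over all consistent profiles, recover the Catalan number $\frac{1}{n+1}\binom{2n}{n}$, and watch the proportion of leaves approach 1/4.

```python
from math import comb, factorial
from fractions import Fraction

def node_type_count(n, k0, k1, k2):
    """(2^{k1}/n) * multinomial(n; k0, k1, k2), or 0 when the profile is inconsistent."""
    if min(k0, k1, k2) < 0 or k0 + k1 + k2 != n or k0 != k2 + 1:
        return 0
    multinomial = factorial(n) // (factorial(k0) * factorial(k1) * factorial(k2))
    return 2**k1 * multinomial // n

def leaf_statistics(n):
    """Return (number of binary trees of size n, exact mean number of leaves)."""
    total = leaves = 0
    for k2 in range(0, (n - 1) // 2 + 1):
        k0, k1 = k2 + 1, n - 2 * k2 - 1     # forced by the two consistency conditions
        c = node_type_count(n, k0, k1, k2)
        total += c
        leaves += k0 * c
    return total, Fraction(leaves, total)

if __name__ == "__main__":
    for n in (5, 50, 500):
        total, mean = leaf_statistics(n)
        print(n, float(mean) / n)            # tends to 1/4 as n grows
```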


> III.17. Leaves and node-degree profile in Cayley trees. For Cayley trees, the bivariate EGF
with u marking the number of leaves is the solution to

    $T(z,u) = uz + z\left(e^{T(z,u)} - 1\right).$

(By Lagrange inversion, the distribution is expressible in terms of Stirling partition numbers.)
The mean number of leaves in a random Cayley tree is asymptotic to $n e^{-1}$. More generally, the
mean number of nodes of outdegree k in a random Cayley tree of size n is asymptotic to

    $n \cdot e^{-1}\,\frac{1}{k!}.$

Degrees are thus approximately described by a Poisson law of rate 1. ◁


> III.18. Node-degree profile in simple varieties of trees. For a family of trees generated
by $T(z) = z\phi(T(z))$ with $\phi$ a power series, the BGF of the number of nodes of degree k
satisfies

    $T(z,u) = z\left(\phi(T(z,u)) + \phi_k (u-1) T(z,u)^k\right),$

where $\phi_k = [u^k]\phi(u)$. The cumulative GF is

    $\Omega(z) = z\,\frac{\phi_k T(z)^k}{1 - z\phi'(T(z))} = \phi_k z^2 T(z)^{k-1} T'(z),$

from which expectations can be determined. ◁


> III.19. Marking in functional graphs. Consider the class $\mathcal{F}$ of finite mappings discussed in
Chapter II:

    $\mathcal{F} = \mathrm{SET}(\mathcal{K}), \qquad \mathcal{K} = \mathrm{CYC}(\mathcal{T}), \qquad \mathcal{T} = \mathcal{Z} \star \mathrm{SET}(\mathcal{T}).$

The translation into EGFs is

    $F(z) = e^{K(z)}, \qquad K(z) = \log\frac{1}{1 - T(z)}, \qquad T(z) = z e^{T(z)}.$

Here are the bivariate EGFs for (i) the number of components, (ii) the number of maximal
trees, (iii) the number of leaves:

    (i) $e^{uK(z)}$,    (ii) $\frac{1}{1 - uT(z)}$,
    (iii) $\frac{1}{1 - T(z,u)}$ with $T(z,u) = (u-1)z + z e^{T(z,u)}$.

The trivariate EGF $F(z, u_1, u_2)$ of functional graphs with $u_1$ marking components and $u_2$ marking
trees is

    $F(z, u_1, u_2) = \exp\left(u_1 \log(1 - u_2 T(z))^{-1}\right) = \frac{1}{(1 - u_2 T(z))^{u_1}}.$

An explicit expression for the coefficients involves the Stirling cycle numbers. ◁

We shall now stop supplying examples that could be multiplied ad libitum, since
such calculations greatly simplify when interpreted in the light of asymptotic analysis, 
as developed in Part B. The phenomena observed asymptotically are, for good reasons, 
especially close to what the classical theory of branching processes provides (see the 
books by Athreya—Ney [21] and Harris [324], as well as our discussion in the context 
of “complete” GFs on p. 196). 


Linear transformations on parameters and path length in trees. We have so 
far been dealing with a parameter defined directly by recursion. Next, we turn to 
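other parameters such as path length. As a preamble, one needs a simple linear
transformation on combinatorial parameters. Let $\mathcal{A}$ be a class equipped with two scalar
parameters, $\chi$ and $\xi$, related by

    $\chi(\alpha) = |\alpha| + \xi(\alpha).$

Then, the combinatorial form of BGFs yields

    $\sum_{\alpha \in \mathcal{A}} z^{|\alpha|} u^{\chi(\alpha)} = \sum_{\alpha \in \mathcal{A}} z^{|\alpha|} u^{|\alpha| + \xi(\alpha)} = \sum_{\alpha \in \mathcal{A}} (zu)^{|\alpha|} u^{\xi(\alpha)};$

that is,

    (34)    $A_{\chi}(z,u) = A_{\xi}(zu, u).$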
This is clearly a general mechanism: 


Linear transformations and MGFs: A linear transformation on parameters induces 
a monomial substitution on the corresponding marking variables in MGFs. 


We now put this mechanism to use in the recursive analysis of path length in trees. 


Example III.15. Path length in trees. The path length of a tree is defined as the sum of distances
of all nodes to the root of the tree, where distances are measured by the number of edges on 
the minimal connecting path of a node to the root. Path length is an important characteristic 
of trees. For instance, when a tree is used as a data structure with nodes containing additional 
information, path length represents the total cost of accessing all data items when a search 
is started from the root. For this reason, path length surfaces, under various models, in the 
analysis of algorithms, in particular, in the area of algorithms and data structures for searching 
and sorting (e.g., tree-sort, quicksort, radix-sort [377, 538]). 
The formal definition of path length of a tree is

    (35)    $\lambda(\tau) := \sum_{\nu \in \tau} \operatorname{dist}(\nu, \operatorname{root}(\tau)),$

where the sum is over all nodes of the tree and the distance between two nodes is measured by
the number of connecting edges. The definition implies an inductive rule

    (36)    $\lambda(\tau) = \sum_{\upsilon \prec \tau} \left(\lambda(\upsilon) + |\upsilon|\right),$

in which $\upsilon \prec \tau$ indicates a summation over all the root subtrees $\upsilon$ of $\tau$. (To verify the
equivalence of (35) and (36), observe that path length also equals the sum of all subtree sizes.)
From this point on, we focus the discussion on general Catalan trees (see Note III.20 for
other cases): $\mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G})$. Introduce momentarily the parameter $\mu(\tau) = |\tau| + \lambda(\tau)$. Then,
one has from the inductive definition (36) and the general transformation rule (34):

    (37)    $G_{\lambda}(z,u) = \frac{z}{1 - G_{\mu}(z,u)}$  and  $G_{\mu}(z,u) = G_{\lambda}(zu, u).$

In other words, $G(z,u) \equiv G_{\lambda}(z,u)$ satisfies a nonlinear functional equation of the difference
type:

    $G(z,u) = \frac{z}{1 - G(uz, u)}.$

(This functional equation will be revisited in connection with area under Dyck paths in
Chapter V, p. 330.) The generating function $\Omega(z)$ of cumulated values of $\lambda$ is then obtained by
differentiation with respect to u, then setting u = 1. We find in this way that the cumulative GF
$\Omega(z) := \partial_u G(z,u)|_{u=1}$ satisfies

    $\Omega(z) = \frac{z}{(1 - G(z))^2}\left(z G'(z) + \Omega(z)\right),$

which is a linear equation that solves to

    $\Omega(z) = \frac{z^2 G'(z)}{(1 - G(z))^2 - z} = \frac{z}{2(1-4z)} - \frac{z}{2\sqrt{1-4z}}.$

Consequently, one has (n ≥ 1)

    $[z^n]\Omega(z) = 2^{2n-3} - \frac{1}{2}\binom{2n-2}{n-1},$

where the sequence starting 1, 5, 22, 93, 386 for n ≥ 2 constitutes EIS A000346. By elementary
asymptotic analysis, we get:

The mean path length of a random Catalan tree of size n is asymptotic to $\frac{1}{2}\sqrt{\pi n^3}$;
in short: a branch from the root to a random node in a random Catalan tree of size n
has expected length of the order of $\sqrt{n}$.

Random Catalan trees thus tend to be somewhat imbalanced: by comparison, a fully balanced
binary tree has all paths of length at most $\log_2 n + O(1)$. □
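The cumulated path lengths $2^{2n-3} - \frac{1}{2}\binom{2n-2}{n-1}$ can be cross-checked against the inductive rule (36). The sketch below (function names are hypothetical) carries, for trees and for sequences of trees, both the number of objects and the cumulated value of the parameter, and never uses the closed form.

```python
from math import comb, pi, sqrt

def cumulated_path_length(max_n):
    """Return (G, L): G[n] = number of plane trees of size n,
    L[n] = sum of the path lengths of all plane trees of size n."""
    G = [0] * (max_n + 1)
    L = [0] * (max_n + 1)
    FC = [0] * (max_n + 1)   # FC[m]: number of sequences of trees with m nodes
    FM = [0] * (max_n + 1)   # FM[m]: cumulated sum of (path length + size) over root subtrees
    FC[0] = 1
    for n in range(1, max_n + 1):
        if n == 1:
            G[1], L[1] = 1, 0
        else:
            G[n], L[n] = FC[n - 1], FM[n - 1]   # rule (36): root on top of a forest
        for s in range(1, n + 1):               # s = size of the first tree of the forest
            FC[n] += G[s] * FC[n - s]
            FM[n] += (L[s] + s * G[s]) * FC[n - s] + G[s] * FM[n - s]
    return G, L

if __name__ == "__main__":
    G, L = cumulated_path_length(200)
    print(L[2:7])                                # cumulated values for n = 2..6
    n = 200
    print((L[n] / G[n]) / (0.5 * sqrt(pi * n**3)))   # ratio to the asymptotic estimate
```

The ratio printed last approaches 1 only slowly, since the next term of the expansion is of relative order $1/\sqrt{n}$.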


The imbalance in random Catalan trees is a general phenomenon: it holds for binary
Catalan and more generally for all simple varieties of trees. Note III.20 below and
Example VII.9 (p. 461) imply that path length is invariably of order $n\sqrt{n}$ on average
in such cases. Height is of typical order $\sqrt{n}$ as shown by Rényi and Szekeres [507], de
Bruijn, Knuth, and Rice [145], Kolchin [386], as well as Flajolet and Odlyzko [246]:
see Subsection VII.10.2, p. 535 for the outline of a proof. Figure III.15 borrowed
from [538] illustrates this on a simulation. (The contour of the histogram of nodes by
levels, once normalized, has been proved to converge to the process known as Brownian
excursion.)
> III.20. Path length in simple varieties of trees. The BGF of path length in a variety of trees
generated by $T(z) = z\phi(T(z))$ satisfies

    $T(z,u) = z\phi(T(zu, u)).$

In particular, the cumulative GF is

    $\Omega(z) \equiv \partial_u \left(T(z,u)\right)\big|_{u=1} = \frac{\phi'(T(z))}{\phi(T(z))}\left(z T'(z)\right)^2,$

from which coefficients can be extracted. ◁
Figure III.15. A random pruned binary tree of size 256 and its associated level profile:
the histogram on the left displays the number of nodes at each level in the tree.


III. 6. Complete generating functions and discrete models 


By a complete generating function, we mean, loosely speaking, a generating function
in a (possibly large, and even infinite in the limit) number of variables that mark
a homogeneous collection of characteristics of a combinatorial class². For instance
one might be interested in the joint distribution of all the different letters composing
words, the number of cycles of all lengths in permutations, and so on. A complete
MGF naturally entails detailed knowledge on the enumerative properties of structures
to which it is relative. Complete generating functions, given their expressive power,
also make weighted models amenable to calculation, a situation that covers in particular
Bernoulli trials (p. 190) and branching processes from classical probability theory
(p. 196).


Complete GFs for words. As a basic example, consider the class of all words
$\mathcal{W} = \mathrm{SEQ}\{\mathcal{A}\}$ over some finite alphabet $\mathcal{A} = \{a_1, \ldots, a_r\}$. Let $\chi = (\chi_1, \ldots, \chi_r)$,
where $\chi_j(w)$ is the number of occurrences of the letter $a_j$ in word w. The MGF of $\mathcal{A}$
with respect to $\chi$ is

    $\mathcal{A} = u_1 a_1 + u_2 a_2 + \cdots + u_r a_r \quad\Longrightarrow\quad A(z, \mathbf{u}) = z u_1 + z u_2 + \cdots + z u_r,$

and $\chi$ on $\mathcal{W}$ is clearly inherited from $\chi$ on $\mathcal{A}$. Thus, by the sequence rule, one has

    (38)    $\mathcal{W} = \mathrm{SEQ}(\mathcal{A}) \quad\Longrightarrow\quad W(z, \mathbf{u}) = \frac{1}{1 - z(u_1 + u_2 + \cdots + u_r)},$

which describes all words according to their compositions into letters. In particular,
the number of words with $n_j$ occurrences of letter $a_j$ and with $n = \sum n_j$ is in this
framework obtained as

    $[u_1^{n_1} u_2^{n_2} \cdots u_r^{n_r}]\,(u_1 + u_2 + \cdots + u_r)^n = \binom{n}{n_1, n_2, \ldots, n_r} = \frac{n!}{n_1!\, n_2! \cdots n_r!}.$

We are back to the usual multinomial coefficients.

² Complete GFs are not new objects. They are simply an avatar of multivariate GFs. Thus the term is
only meant to be suggestive of a particular usage of MGFs, and essentially no new theory is needed in order
to cope with them.


> III.21. After Bhaskara Acharya (circa 1150AD). Consider all the numbers formed in decimal
with digit 1 used once, with digit 2 used twice,..., with digit 9 used nine times. Such numbers 
all have 45 digits. Compute their sum S and discover, much to your amazement that S equals 


45875559600006 1532 19084769286399999999999999954 1244403999938467809 152307 13600000. 


This number has a long run of nines (and further nines are hidden!). Is there a simple explana- 
tion? This exercise is inspired by the Indian mathematician Bhaskara Acharya who discovered 
multinomial coefficients near 1150AD; see [377, pp. 23-24] for a brief historical note. ◁


Complete GFs for permutations and set partitions. Consider permutations and
the various lengths of their cycles. The MGF where $u_k$ marks cycles of length k for
k = 1, 2, ... can be written as an MGF in infinitely many variables:

    (39)    $P(z, \mathbf{u}) = \exp\left(u_1 \frac{z}{1} + u_2 \frac{z^2}{2} + u_3 \frac{z^3}{3} + \cdots\right).$

This MGF expression has the neat feature that, upon restricting all but a finite number
of $u_j$ to 1, we derive all the particular cases of interest with respect to any finite
collection of cycle lengths. Observe also that one can calculate in the usual way any
coefficient $[z^n]P$ as it only involves the variables $u_1, \ldots, u_n$.


> III.22. The theory of formal power series in infinitely many variables. (This note is for
formalists.) Mathematically, an object like P in (39) is perfectly well defined. Let $U =
\{u_1, u_2, \ldots\}$ be an infinite collection of indeterminates. First, the ring of polynomials $R =
\mathbb{C}[U]$ is well defined and a given element of R involves only finitely many indeterminates.
Then, from R, one can define the ring of formal power series in z, namely $R[[z]]$. (Note that,
if $f \in R[[z]]$, then each $[z^n]f$ involves only finitely many of the variables $u_j$.) The basic
operations and the notion of convergence, as described in Appendix A.5: Formal power series,
p. 730, apply in a standard way.

For instance, in the case of (39), the complete GF $P(z, \mathbf{u})$ is obtainable as the formal limit

    $P(z, \mathbf{u}) = \lim_{k \to \infty} \exp\left(u_1 \frac{z}{1} + \cdots + u_k \frac{z^k}{k} + \frac{z^{k+1}}{k+1} + \cdots\right)$

in $R[[z]]$ equipped with the formal topology. (In contrast, the quantity evocative of a generating
function of words over an infinite alphabet

    $W = \left(1 - z \sum_{j=1}^{\infty} u_j\right)^{-1}$

cannot be soundly defined as an element of the formal domain $R[[z]]$.) ◁


Henceforth, we shall keep in mind that verifications of formal correctness regard- 
ing power series in infinitely many indeterminates are always possible by returning to 
basic definitions. 

Complete generating functions are often surprisingly simple to expand. For instance,
the equivalent form of (39)

    $P(z, \mathbf{u}) = e^{u_1 z/1} \cdot e^{u_2 z^2/2} \cdot e^{u_3 z^3/3} \cdots$

implies immediately that the number of permutations with $k_1$ cycles of size 1, $k_2$ of
size 2, and so on, is

    (40)    $\frac{n!}{k_1!\, k_2! \cdots k_n!\; 1^{k_1} 2^{k_2} \cdots n^{k_n}},$

provided $\sum j k_j = n$. This is a result originally due to Cauchy. Similarly, the EGF of
set partitions with $u_j$ marking the number of blocks of size j is

    $S(z, \mathbf{u}) = \exp\left(u_1 \frac{z}{1!} + u_2 \frac{z^2}{2!} + u_3 \frac{z^3}{3!} + \cdots\right).$

A formula analogous to (40) follows: the number of partitions with $k_1$ blocks of size
1, $k_2$ of size 2, and so on, is

    $\frac{n!}{k_1!\, k_2! \cdots k_n!\; 1!^{k_1} 2!^{k_2} \cdots n!^{k_n}}.$

Several examples of such complete generating functions are presented in Comtet's
book; see [129], pages 225 and 233.
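Cauchy's count $n!/(k_1!\cdots k_n!\,1^{k_1}\cdots n^{k_n})$ is easy to confirm by brute force for small n. The following sketch (hypothetical helper names; permutations are represented as tuples of images of 0, ..., n-1) tallies the cycle types of all permutations of size 6 and compares each tally with the formula.

```python
from itertools import permutations
from math import factorial
from collections import Counter

def cycle_type(perm):
    """Return (k_1, ..., k_n), where k_j is the number of cycles of length j of perm."""
    n = len(perm)
    seen = [False] * n
    k = [0] * n
    for start in range(n):
        if not seen[start]:
            length, i = 0, start
            while not seen[i]:          # follow the cycle containing `start`
                seen[i] = True
                i = perm[i]
                length += 1
            k[length - 1] += 1
    return tuple(k)

def cauchy_count(k):
    """Number of permutations with k[j-1] cycles of length j, as given by formula (40)."""
    n = sum(j * kj for j, kj in enumerate(k, start=1))
    denom = 1
    for j, kj in enumerate(k, start=1):
        denom *= factorial(kj) * j**kj
    return factorial(n) // denom

if __name__ == "__main__":
    tally = Counter(cycle_type(p) for p in permutations(range(6)))
    for k, count in sorted(tally.items()):
        print(k, count, cauchy_count(k))
```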


> III.23. Complete GFs for compositions and surjections. The complete GFs of integer
compositions and surjections with $u_j$ marking the number of components of size j are

    $\frac{1}{1 - \sum_{j \ge 1} u_j z^j}, \qquad \frac{1}{1 - \sum_{j \ge 1} u_j \frac{z^j}{j!}}.$

The associated counts with $n = \sum_j j k_j$ are given by

    $\binom{k_1 + k_2 + \cdots}{k_1, k_2, \ldots}, \qquad \frac{n!}{1!^{k_1} 2!^{k_2} \cdots}\binom{k_1 + k_2 + \cdots}{k_1, k_2, \ldots}.$

These factored forms follow directly from the multinomial expansion. The symbolic form of
the multinomial expansion of powers of a generating function is sometimes expressed in terms
of Bell polynomials, themselves nothing but a rephrasing of the multinomial expansion; see
Comtet's book [129, Sec. 3.3] for a fair treatment of such polynomials. ◁


> III.24. Faà di Bruno's formula. The formulae for the successive derivatives of a functional
composition $h(z) = f(g(z))$

    $\partial_z h(z) = f'(g(z))\,g'(z), \qquad \partial_z^2 h(z) = f''(g(z))\,g'(z)^2 + f'(g(z))\,g''(z), \quad \ldots$

are clearly equivalent to the expansion of a formal power series composition. Indeed, assume
without loss of generality that z = 0 and g(0) = 0; set $f_n := \partial_z^n f(0)$, and similarly for g, h.
Then

    $h(z) \equiv \sum_n h_n \frac{z^n}{n!} = \sum_k \frac{f_k}{k!}\left(g_1 z + \frac{g_2}{2!} z^2 + \cdots\right)^k.$

Thus in one direct application of the multinomial expansion, one finds

    $\frac{h_n}{n!} = \sum_k \frac{f_k}{k!} \sum_{\mathcal{C}} \binom{k}{\ell_1, \ell_2, \ldots, \ell_k}\left(\frac{g_1}{1!}\right)^{\ell_1}\left(\frac{g_2}{2!}\right)^{\ell_2} \cdots \left(\frac{g_k}{k!}\right)^{\ell_k},$

where the summation condition $\mathcal{C}$ is: $1\ell_1 + 2\ell_2 + \cdots + k\ell_k = n$, $\ell_1 + \ell_2 + \cdots + \ell_k = k$.
This shallow identity is known as Faà di Bruno's formula [129, p. 137]. (Faà di Bruno (1825–1888)
was canonized by the Catholic Church in 1988, presumably for reasons unrelated to his
formula.) ◁
> III.25. Relations between symmetric functions. Symmetric functions may be manipulated
by mechanisms that are often reminiscent of the set and multiset construction. They appear in
many areas of combinatorial enumeration. Let $X = \{x_i\}_{i=1}^{r}$ be a collection of formal variables.
Define the symmetric functions

    $\prod_i (1 + x_i z) = \sum_n a_n z^n, \qquad \prod_i \frac{1}{1 - x_i z} = \sum_n b_n z^n, \qquad \sum_i \frac{x_i z}{1 - x_i z} = \sum_n c_n z^n.$

The $a_n, b_n, c_n$, called, respectively, elementary, monomial, and power symmetric functions, are
expressible as

    $a_n = \sum_{i_1 < i_2 < \cdots < i_n} x_{i_1} x_{i_2} \cdots x_{i_n}, \qquad b_n = \sum_{i_1 \le i_2 \le \cdots \le i_n} x_{i_1} x_{i_2} \cdots x_{i_n}, \qquad c_n = \sum_{i=1}^{r} x_i^n.$

The following relations hold for the OGFs A(z), B(z), C(z) of $a_n, b_n, c_n$:

    $B(z) = \frac{1}{A(-z)}, \qquad A(z) = \frac{1}{B(-z)},$
    $C(z) = z \frac{d}{dz} \log B(z), \qquad B(z) = \exp \int_0^z C(t)\, \frac{dt}{t}.$

Consequently, each of $a_n, b_n, c_n$ is polynomially expressible in terms of any of the other
quantities. (The connection coefficients, as in Note III.24, involve multinomials.) ◁


> III.26. Regular graphs. A graph is r-regular iff each node has degree exactly equal to r. The
number of r-regular graphs of size n is

    $[x_1^r x_2^r \cdots x_n^r] \prod_{1 \le i < j \le n} (1 + x_i x_j).$

[Gessel [289] has shown how to extract explicit expressions from such huge symmetric functions;
see Appendix B.4: Holonomic functions, p. 748.] ◁


III. 6.1. Word models. The enumeration of words constitutes a rich chapter of
combinatorial analysis, and complete GFs serve to generalize many results to the case 
of non-uniform letter probabilities, such as the coupon collector problem and the birth- 
day paradox considered in Chapter II. Applications are to be found in classical prob- 
ability theory and statistics [139] (the so-called Bernoulli trial models), as well as in 
computer science [564] and mathematical models of biology [603]. 


Example III.16. Words and records. Fix an alphabet $\mathcal{A} = \{a_1, \ldots, a_r\}$ and let $\mathcal{W} = \mathrm{SEQ}\{\mathcal{A}\}$
be the class of all words over $\mathcal{A}$, where $\mathcal{A}$ is naturally ordered by $a_1 < a_2 < \cdots < a_r$.
Given a word $w = w_1 \cdots w_n$, a (strict) record is an element $w_j$ that is larger than all preceding
elements: $w_j > w_i$ for all $i < j$. (Refer to Figure II.15 of Chapter II for a graphical rendering
of records in the case of permutations.)

Consider first the subset of $\mathcal{W}$ comprising all words that have the letters $a_{i_1}, \ldots, a_{i_k}$ as
successive records, where $i_1 < \cdots < i_k$. The symbolic description of this set is in the form of
a product of k terms

    (41)    $\left(a_{i_1} \mathrm{SEQ}(a_1 + \cdots + a_{i_1})\right) \cdots \left(a_{i_k} \mathrm{SEQ}(a_1 + \cdots + a_{i_k})\right).$

Consider now MGFs of words where z marks length, v marks the number of records, and each
$u_j$ marks the number of occurrences of letter $a_j$. The MGF associated to the subset described
in (41) is then

    $\left(z v u_{i_1} (1 - z(u_1 + \cdots + u_{i_1}))^{-1}\right) \cdots \left(z v u_{i_k} (1 - z(u_1 + \cdots + u_{i_k}))^{-1}\right).$

Summing over all values of k and of $i_1 < \cdots < i_k$ gives

    (42)    $W(z, v, \mathbf{u}) = \prod_{s=1}^{r} \left(1 + z v u_s (1 - z(u_1 + \cdots + u_s))^{-1}\right),$

the rationale being that, for arbitrary quantities $y_s$, one has by distributivity:

    $\sum_{k=0}^{r} \; \sum_{1 \le i_1 < \cdots < i_k \le r} y_{i_1} y_{i_2} \cdots y_{i_k} = \prod_{s=1}^{r} (1 + y_s).$

We shall encounter more applications of (42) below. For the time being let us simply
examine the mean number of records in a word of length n over the alphabet $\mathcal{A}$, when all such
words are taken equally likely. One should set $u_j \mapsto 1$ (the composition into specific letters is
forgotten), so that W assumes the simpler form

    $W(z, v) = \prod_{j=1}^{r} \left(1 + \frac{vz}{1 - jz}\right).$

Logarithmic differentiation then gives access to the generating function of cumulated values,

    $\Omega(z) \equiv \frac{\partial}{\partial v} W(z, v)\Big|_{v=1} = \frac{z}{1 - rz} \sum_{j=1}^{r} \frac{1}{1 - (j-1)z}.$

Thus, by partial fraction expansion, the mean number of records in $\mathcal{W}_n$ (whose cardinality is $r^n$)
has the exact value

    (43)    $\mathbb{E}_{\mathcal{W}_n}(\#\,\text{records}) = H_r - \sum_{j=1}^{r-1} \frac{(j/r)^n}{r - j}.$

There appears the harmonic number $H_r$ as in the permutation case, but now with a negative
correction term which, for fixed r, vanishes exponentially with n. □
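The exact mean can be compared with exhaustive enumeration on small alphabets. In the sketch below, the closed form is taken as $H_r - \sum_{j=1}^{r-1} (j/r)^n/(r-j)$, which is what partial fraction expansion of $\frac{z}{1-rz}\sum_{j}\frac{1}{1-(j-1)z}$ gives; the function names are hypothetical and exact rational arithmetic is used so that the comparison is an equality.

```python
from fractions import Fraction
from itertools import product

def mean_records_formula(r, n):
    """H_r - sum_{j=1}^{r-1} (j/r)^n / (r - j), as an exact rational."""
    harmonic = sum(Fraction(1, j) for j in range(1, r + 1))
    correction = sum(Fraction(j, r) ** n / (r - j) for j in range(1, r))
    return harmonic - correction

def mean_records_bruteforce(r, n):
    """Average number of strict records over all r^n words of length n."""
    total = 0
    for word in product(range(r), repeat=n):
        best = -1
        for letter in word:
            if letter > best:       # strictly larger than everything seen so far
                total += 1
                best = letter
    return Fraction(total, r ** n)

if __name__ == "__main__":
    for r, n in [(2, 2), (3, 4), (4, 6)]:
        print(r, n, mean_records_formula(r, n), mean_records_bruteforce(r, n))
```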


Example III.17. Weighted word models and Bernoulli trials. Let $\mathcal{A} = \{a_1, \ldots, a_r\}$ be an
alphabet of cardinality r, and let $\Lambda = \{\lambda_1, \ldots, \lambda_r\}$ be a system of numbers called weights,
where weight $\lambda_j$ is viewed as attached to letter $a_j$. Weights may be extended from letters to
words multiplicatively by defining the weight $\pi(w)$ of word w as

    $\pi(w) = \lambda_{i_1} \lambda_{i_2} \cdots \lambda_{i_n}$  if  $w = a_{i_1} a_{i_2} \cdots a_{i_n}$
    $\phantom{\pi(w)} = \prod_{j=1}^{r} \lambda_j^{\chi_j(w)},$

where $\chi_j(w)$ is the number of occurrences of letter $a_j$ in w. Finally, the weight of a set is by
definition the sum of the weights of its elements.

Combinatorially, weights of sets are immediately obtained once the corresponding generating
function is known. Indeed, let $\mathcal{S} \subseteq \mathcal{W} = \mathrm{SEQ}\{\mathcal{A}\}$ have the complete GF

    $S(z, u_1, \ldots, u_r) = \sum_{w \in \mathcal{S}} z^{|w|} u_1^{\chi_1(w)} \cdots u_r^{\chi_r(w)},$

where $\chi_j(w)$ is the number of occurrences of letter $a_j$ in w. Then one has

    $S(z, \lambda_1, \ldots, \lambda_r) = \sum_{w \in \mathcal{S}} \pi(w) z^{|w|},$

so that extracting the coefficient of $z^n$ gives the total weight of $\mathcal{S}_n = \mathcal{S} \cap \mathcal{W}_n$ under the weight
system $\Lambda$. In other words, the GF of a weighted set is obtained by substitution of the numerical
values of the weights inside the associated complete MGF.

In probability theory, Bernoulli trials refer to sequences of independent draws from a fixed
distribution with finitely many possible values. One may think of the succession of flippings of
a coin or castings of a die. If any trial has r possible outcomes, then the various possibilities can
be described by letters of the r-ary alphabet $\mathcal{A}$. If the probability of the jth outcome is taken to
be $\lambda_j$, then the $\Lambda$-weighted model on words becomes the usual probabilistic model of independent
trials. (In this situation, the $\lambda_j$ are often written as $p_j$.) Observe that, in the probabilistic
situation, one must have $\lambda_1 + \cdots + \lambda_r = 1$ with each $\lambda_j$ satisfying $0 \le \lambda_j \le 1$. The equiprobable
case, where each outcome has probability 1/r, can be obtained by setting $\lambda_j = 1/r$, leaving
us with the usual enumerative model. In terms of GFs, the coefficient $[z^n]S(z, \lambda_1, \ldots, \lambda_r)$
then represents the probability that a random word of $\mathcal{W}_n$ belongs to $\mathcal{S}$. Multivariate generating
functions and cumulative generating functions then obey properties similar to their usual
(ordinary, exponential) counterparts.

As an illustration, assume one has a biased coin with probability p for heads (H) and q =
1 − p for tails (T). Consider the event: "in n tosses of the coin, there never appear $\ell$ contiguous
heads". The alphabet is $\mathcal{A} = \{H, T\}$. The construction describing the events of interest is, as
seen in Subsection I.4.1 (p. 51),

    $\mathcal{S} = \mathrm{SEQ}_{<\ell}\{H\}\; \mathrm{SEQ}\{T\, \mathrm{SEQ}_{<\ell}\{H\}\}.$

Its GF, with u marking heads and v marking tails, is then

    $W(z, u, v) = \frac{1 - z^{\ell} u^{\ell}}{1 - zu} \left(1 - zv\, \frac{1 - z^{\ell} u^{\ell}}{1 - zu}\right)^{-1}.$

Thus, the probability of the absence of $\ell$-runs among a sequence of n random coin tosses is
obtained after the substitution $u \to p$, $v \to q$ in the MGF,

    $[z^n]\, \frac{1 - p^{\ell} z^{\ell}}{1 - z + q p^{\ell} z^{\ell+1}},$

leading to an expression which is amenable to numerical or asymptotic analysis. For instance,
Feller's book [206, p. 322-326] offers a classical discussion of the problem. □
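The coefficient extraction can be carried out with a linear recurrence. The sketch below assumes the rational form $(1 - p^{\ell}z^{\ell})/(1 - z + q p^{\ell} z^{\ell+1})$ obtained from the construction SEQ_{<ℓ}{H} SEQ{T SEQ_{<ℓ}{H}} after substituting u → p, v → q; the function names are hypothetical, and the brute-force routine is only there as an independent check on small n.

```python
from fractions import Fraction
from itertools import product

def prob_no_run(n, ell, p):
    """[z^n] (1 - p^ell z^ell) / (1 - z + q p^ell z^(ell+1)), for ell >= 1.

    The denominator yields the recurrence c_m = c_{m-1} - q p^ell c_{m-ell-1};
    the numerator contributes +1 at m = 0 and -p^ell at m = ell."""
    q = 1 - p
    c = []
    for m in range(n + 1):
        value = (1 if m == 0 else 0) - (p**ell if m == ell else 0)
        if m >= 1:
            value += c[m - 1]
        if m >= ell + 1:
            value -= q * p**ell * c[m - ell - 1]
        c.append(value)
    return c[n]

def prob_no_run_bruteforce(n, ell, p):
    """Sum the probabilities of all H/T words of length n without ell contiguous heads."""
    q = 1 - p
    total = 0
    for word in product("HT", repeat=n):
        s = "".join(word)
        if "H" * ell not in s:
            total += p ** s.count("H") * q ** s.count("T")
    return total

if __name__ == "__main__":
    p = Fraction(1, 3)
    for n in range(0, 8):
        print(n, prob_no_run(n, 2, p), prob_no_run_bruteforce(n, 2, p))
```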


Example III.18. Records in Bernoulli trials. We pursue the discussion of probabilistic
models on words and come back to the analysis of records. Assume now that the alphabet
$\mathcal{A} = \{a_1, \ldots, a_r\}$ has in all generality the probability $p_j$ associated with the letter $a_j$. The
mean number of records is analysed by a process entirely parallel to the derivation of (43): one
finds by logarithmic differentiation of (42)

    (44)    $\mathbb{E}_{\mathcal{W}_n}(\#\,\text{records}) = [z^n]\Omega(z)$  where  $\Omega(z) = \frac{z}{1-z} \sum_{j=1}^{r} \frac{p_j}{1 - z(p_1 + \cdots + p_{j-1})}.$

The cumulative GF $\Omega(z)$ in (44) has simple poles at the points $1, 1/P_{r-1}, 1/P_{r-2}$, and so on,
where $P_s = p_1 + \cdots + p_s$. For asymptotic purposes, only the dominant pole at z = 1 counts
(see Chapter IV for a systematic discussion), near which

    $\Omega(z) \underset{z \to 1}{\sim} \frac{1}{1-z} \sum_{j=1}^{r} \frac{p_j}{p_j + p_{j+1} + \cdots + p_r}.$

Consequently, one has an elegant asymptotic formula generalizing the case of permutations:

The mean number of records in a random word of length n with non-uniform letter
probabilities $p_j$ satisfies asymptotically ($n \to +\infty$)

    $\mathbb{E}_{\mathcal{W}_n}(\#\,\text{records}) \sim \sum_{j=1}^{r} \frac{p_j}{p_j + p_{j+1} + \cdots + p_r}.$

This relation and similar ones were obtained by Burge [97]; analogous ideas may serve to analyse
the sorting algorithm Quicksort under equal keys [536] as well as the hybrid data structures
of Bentley and Sedgewick; see [47, 124]. □
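To see the approach to the limit concretely, one can extract $[z^n]\Omega(z)$ term by term: since $[z^n]\frac{z}{(1-z)(1-Pz)} = 1 + P + \cdots + P^{n-1}$, the exact mean is $\sum_j p_j (1 + P_{j-1} + \cdots + P_{j-1}^{n-1})$ with $P_{j-1} = p_1 + \cdots + p_{j-1}$. The sketch below uses hypothetical names and an arbitrary three-letter distribution, and compares the exact mean, a weighted brute-force enumeration, and the limit $\sum_j p_j/(p_j + \cdots + p_r)$.

```python
from fractions import Fraction
from itertools import product

def mean_records_exact(p, n):
    """[z^n] of z/(1-z) * sum_j p_j / (1 - z P_{j-1}), with P_{j-1} = p_1 + ... + p_{j-1}."""
    total, prefix = 0, 0
    for pj in p:
        total += pj * sum(prefix**m for m in range(n))
        prefix += pj
    return total

def mean_records_limit(p):
    """sum_j p_j / (p_j + p_{j+1} + ... + p_r); assumes every p_j is positive."""
    total, tail = 0, sum(p)
    for pj in p:
        total += pj / tail
        tail -= pj
    return total

def mean_records_bruteforce(p, n):
    """Expected number of strict records under independent letters with law p."""
    total = 0
    for word in product(range(len(p)), repeat=n):
        weight, best, records = 1, -1, 0
        for letter in word:
            weight *= p[letter]
            if letter > best:
                records, best = records + 1, letter
        total += weight * records
    return total

if __name__ == "__main__":
    p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]   # illustrative probabilities
    for n in (1, 5, 20, 60):
        print(n, float(mean_records_exact(p, n)), float(mean_records_limit(p)))
```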


Coupon collector problem and birthday paradox. Similar considerations apply
to weighted EGFs of words, as considered in Chapter II. For instance, the probability
of having a complete coupon collection at time n in the case a company issues
coupon j with probability $p_j$, for $1 \le j \le r$, is (coupon collector problem, p. 114)

    $\mathbb{P}(C \le n) = n!\,[z^n] \prod_{j=1}^{r} \left(e^{p_j z} - 1\right).$

The probability that all coupons are different at time n is (birthday paradox, p. 114)

    $\mathbb{P}(B > n) = n!\,[z^n] \prod_{j=1}^{r} \left(1 + p_j z\right),$

which corresponds to the birthday problem in the case of non-uniform mating periods.
Integral representations comparable to those of Chapter II are also available:

    $\mathbb{E}(C) = \int_0^{\infty} \left(1 - \prod_{j=1}^{r} \left(1 - e^{-p_j t}\right)\right) dt, \qquad \mathbb{E}(B) = \int_0^{\infty} \prod_{j=1}^{r} (1 + p_j t)\, e^{-t}\, dt.$

See the study by Flajolet, Gardy, and Thimonier [231] for variations on this theme.
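Both coefficient extractions are elementary to carry out. In the sketch below (hypothetical names, illustrative probabilities), $n![z^n]\prod_j(1 + p_j z)$ is computed as n! times an elementary symmetric polynomial, and $n![z^n]\prod_j(e^{p_j z} - 1)$ is expanded as a signed sum of exponentials over subsets of coupons; a direct enumeration of words serves as an independent check.

```python
from fractions import Fraction
from itertools import product, combinations
from math import factorial

def birthday_no_collision(p, n):
    """P(B > n) = n! [z^n] prod_j (1 + p_j z)."""
    coeffs = [1] + [0] * n                 # polynomial truncated at degree n
    for pj in p:
        for d in range(n, 0, -1):          # multiply in place by (1 + pj z)
            coeffs[d] += pj * coeffs[d - 1]
    return factorial(n) * coeffs[n]

def coupon_complete(p, n):
    """P(C <= n) = n! [z^n] prod_j (e^{p_j z} - 1), expanded over subsets of coupons."""
    r = len(p)
    total = 0
    for size in range(r + 1):
        for subset in combinations(p, size):
            total += (-1) ** (r - size) * sum(subset) ** n
    return total

def bruteforce(p, n):
    """Return (P(all n draws distinct), P(all r coupons seen)) by enumeration."""
    distinct = complete = 0
    for word in product(range(len(p)), repeat=n):
        weight = 1
        for letter in word:
            weight *= p[letter]
        if len(set(word)) == n:
            distinct += weight
        if len(set(word)) == len(p):
            complete += weight
    return distinct, complete

if __name__ == "__main__":
    p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    for n in range(1, 7):
        print(n, birthday_no_collision(p, n), coupon_complete(p, n))
```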


> III.27. Birthday paradox with leap years. Assume that the 29th of February exists precisely
once every fourth year. Estimate the effect on the expectation of the first birthday collision. ◁


Example III.19. Rises in Bernoulli trials: Simon Newcomb's problem. Simon Newcomb
(1835-1909), otherwise famous for his astronomical work, was reportedly fond of playing the
following patience game: one draws from a deck of 52 playing cards, stacking them in piles in
such a way that one new pile is started each time a card appears whose number is smaller than
its predecessor. What is the probability of obtaining t piles? A solution to this famous problem
is found in MacMahon's book [428] and a concise account by Andrews appears in [14, §4.4].

Simon Newcomb's problem can be rephrased in terms of rises. Given a word $w =
w_1 \cdots w_n$ over the alphabet $\mathcal{A}$ ordered by $a_1 < a_2 < \cdots$, a weak rise is a position $j < n$
such that $w_j \le w_{j+1}$. (The number of piles in Newcomb's problem is the number of cards
minus 1 minus the number of weak rises.) Let $W \equiv W(z, v, \mathbf{u})$ be the MGF of all words where
z marks length, v marks the number of weak rises, and $u_j$ marks the number of occurrences of
letter j. Set $z_j = z u_j$ and let $W_j \equiv W_j(z, v, \mathbf{u})$ be the MGF relative to those non-empty words
that start with letter $a_j$, so that

    $W = 1 + (W_1 + \cdots + W_r).$

The $W_j$ satisfy the set of equations (j = 1, ..., r),

    (45)    $W_j = z_j + z_j (W_1 + \cdots + W_{j-1}) + v z_j (W_j + \cdots + W_r),$

as seen by considering the first letter of each word. The linear system (45) is easily solved upon
setting $W_j = z_j X_j$. Indeed, by differencing, one finds that

    (46)    $X_{j+1} - X_j = z_j X_j (1 - v), \qquad X_{j+1} = X_j \left(1 + z_j (1 - v)\right).$

In this way, each $X_j$ can be determined in terms of $X_1$. Then transporting the resulting expressions
into the relation (45) taken with j = 1, and solving for $X_1$ leads to an expression for $X_1$,
hence for all the $X_j$ and finally for W itself:

    (47)    $W = \frac{v - 1}{v - P^{-1}}, \qquad P := \prod_{j=1}^{r} \left(1 + (1 - v) z_j\right).$

Goulden and Jackson obtain similar expressions in [303] (pp. 72 and 236).

The result of (47) gives access to moments (e.g., mean and variance) of the number of
rises in a Bernoulli sequence as well as to counting results, once coefficients of the MGF are
extracted. (See also [289, 303] for an approach based on the theory of symmetric functions.)
The OGF (47) can alternatively be derived by an inclusion–exclusion argument: refer to the
particular case of rises in permutations and Eulerian numbers, p. 210. □


> III.28. The final solution to Simon Newcomb's problem. Consider a deck of cards with a suits
and r distinct card values. Set N = ra. (The original problem has r = 13, a = 4, N = 52.)
One has from (47): $W = (1 - v)P/(1 - vP)$. The expansion of $(1 - y)^{-1}$ and the collection
of coefficients yields

    $[z_1^a \cdots z_r^a]\,W = (1 - v) \sum_{k \ge 1} v^{k-1} [z_1^a \cdots z_r^a]\,P^k = (1 - v)^{N+1} \sum_{k \ge 1} \binom{k}{a}^{r} v^{k-1},$

so that

    $[v^t z_1^a \cdots z_r^a]\,W = \sum_{k=1}^{t+1} (-1)^{t+1-k} \binom{N+1}{t+1-k} \binom{k}{a}^{r}.$ ◁
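The alternating sum $\sum_{k=1}^{t+1} (-1)^{t+1-k}\binom{N+1}{t+1-k}\binom{k}{a}^{r}$ can be tested on toy decks, where all arrangements can be listed. The following sketch (hypothetical function names; decks far smaller than Newcomb's) counts arrangements by their number of weak rises, that is, positions j with $w_j \le w_{j+1}$, and compares with the formula.

```python
from math import comb
from itertools import permutations
from collections import Counter

def newcomb_count(r, a, t):
    """Number of arrangements of a deck with r values, a copies each, having t weak rises."""
    N = r * a
    return sum((-1) ** (t + 1 - k) * comb(N + 1, t + 1 - k) * comb(k, a) ** r
               for k in range(1, t + 2))

def weak_rise_tally(r, a):
    """Tally the distinct arrangements of the multiset deck by number of weak rises."""
    deck = [value for value in range(r) for _ in range(a)]
    arrangements = set(permutations(deck))      # removes duplicates due to equal cards
    return Counter(sum(1 for j in range(len(w) - 1) if w[j] <= w[j + 1])
                   for w in arrangements)

if __name__ == "__main__":
    r, a = 3, 2
    tally = weak_rise_tally(r, a)
    for t in range(r * a):
        print(t, tally[t], newcomb_count(r, a, t))
```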


III. 6.2. Tree models. We examine here two important GFs associated with tree 
models; these provide valuable information concerning the degree profile and the level 
profile of trees, while being tightly coupled with an important class of stochastic pro- 
cesses, namely branching processes. 

The major classes of trees that we have encountered so far are the unlabelled 
plane trees and the labelled non-plane trees, prototypes being general Catalan trees 
(Chapter I) and Cayley trees (Chapter II). In both cases, the counting GFs satisfy a 
relation of the form 


    (48)    $Y(z) = z\phi(Y(z)),$

where the GF is either ordinary (plane unlabelled trees) or exponential (non-plane
labelled trees). Corresponding to the two cases, the function $\phi$ is determined,
respectively, by

    (49)    $\phi(w) = \sum_{\omega \in \Omega} w^{\omega}, \qquad \phi(w) = \sum_{\omega \in \Omega} \frac{w^{\omega}}{\omega!},$

where $\Omega \subseteq \mathbb{N}$ is the set of allowed node degrees. Meir and Moon in an important
paper [435] have described some common properties of tree families that are determined
by the Axiom (48). (For instance mean path length is invariably of order $n\sqrt{n}$,
see Chapter VII, and height is $O(\sqrt{n})$.) Following these authors, we call a simple
variety of trees any class whose counting GF is defined by an equation of type (48).
For each of the two cases of (49), we write

    (50)    $\phi(w) = \sum_{j=0}^{\infty} \phi_j w^j.$


Degree profile of trees. First we examine the degree profile of trees. Such a
profile is determined by the collection of parameters $\chi_j$, where $\chi_j(\tau)$ is the number of
nodes of outdegree j in $\tau$. The variable $u_j$ will be used to mark $\chi_j$, that is, nodes of
outdegree j. The discussion already conducted regarding recursive parameters shows
that the GF $Y(z, \mathbf{u})$ satisfies the equation

    $Y(z, \mathbf{u}) = z\Phi(Y(z, \mathbf{u}))$  where  $\Phi(w) = u_0 \phi_0 + u_1 \phi_1 w + u_2 \phi_2 w^2 + \cdots.$

Formal Lagrange inversion can then be applied to $Y(z, \mathbf{u})$, to the effect that its coefficients
are given by the coefficients of the powers of $\Phi$.


Proposition III.7 (Degree profile of trees). The number of trees of size n and degree
profile $(n_0, n_1, n_2, \ldots)$ in a simple variety of trees defined by the "generator" (50) is

    (51)    $Y_{n; n_0, n_1, n_2, \ldots} = \omega_n \cdot \frac{1}{n} \binom{n}{n_0, n_1, n_2, \ldots} \phi_0^{n_0} \phi_1^{n_1} \phi_2^{n_2} \cdots.$

There, $\omega_n = 1$ in the unlabelled case, whereas $\omega_n = n!$ in the labelled case. The
values of the $n_j$ are assumed to satisfy the two consistency conditions: $\sum_j n_j = n$
and $\sum_j j n_j = n - 1$.

Proof. The consistency conditions translate the fact that the total number of nodes
should be n while the total number of edges should equal n − 1 (each node of degree j
is the originator of j edges). The result follows from Lagrange inversion

    $Y_{n; n_0, n_1, n_2, \ldots} = \omega_n \cdot [u_0^{n_0} u_1^{n_1} u_2^{n_2} \cdots] \left(\frac{1}{n} [w^{n-1}]\, \Phi(w)^n\right),$

to which a standard multinomial expansion applies, yielding (51).
For instance, for general Catalan trees ($\phi_j = 1$) and for Cayley trees ($\phi_j = 1/j!$)
these formulae become

    $\frac{1}{n} \binom{n}{n_0, n_1, n_2, \ldots}$  and  $\frac{(n-1)!}{0!^{n_0}\, 1!^{n_1}\, 2!^{n_2} \cdots} \binom{n}{n_0, n_1, n_2, \ldots}.$

□

The proof above also reveals the logical equivalence between the general tree
counting result of Proposition III.7 and the most general case of Lagrange inversion.
(This equivalence is due to the fact that any fixed series is a special case of $\Phi$.) Put
another way, any direct proof of (51) provides a combinatorial proof of the Lagrange
inversion theorem. Such direct derivations have been proposed by Raney [503] and
are based on simple but cunning surgery performed on lattice path representations of
trees (the "conjugation principle" of which a particular case is the "cycle lemma" of
Dvoretzky–Motzkin [184]; see Note I.47, p. 75).
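Formula (51) can be checked on small sizes by listing all plane trees. In this sketch a plane tree is represented (an arbitrary choice, as are the function names) by the tuple of its root subtrees; the degree profiles found by enumeration are compared with $\frac{1}{n}\binom{n}{n_0,n_1,\ldots}$, and the Cayley counts are summed over the same profiles, where they should add up to $n^{n-1}$.

```python
from math import factorial
from collections import Counter

def forests(m):
    """Yield every sequence of plane trees with m nodes in total, as a tuple of trees."""
    if m == 0:
        yield ()
        return
    for s in range(1, m + 1):               # size of the first tree
        for first in plane_trees(s):
            for rest in forests(m - s):
                yield (first,) + rest

def plane_trees(n):
    """Yield every plane tree with n nodes; a tree is the tuple of its root subtrees."""
    yield from forests(n - 1)

def degree_profile(tree, n):
    """Return (n_0, ..., n_{n-1}), where n_j is the number of nodes of outdegree j."""
    counts = [0] * n
    stack = [tree]
    while stack:
        node = stack.pop()
        counts[len(node)] += 1
        stack.extend(node)
    return tuple(counts)

def multinomial(n, parts):
    out = factorial(n)
    for k in parts:
        out //= factorial(k)
    return out

def catalan_profile_count(profile):
    """(1/n) * multinomial(n; n_0, n_1, ...): formula (51) with phi_j = 1."""
    n = sum(profile)
    return multinomial(n, profile) // n

def cayley_profile_count(profile):
    """(n-1)! / prod_j (j!)^{n_j} * multinomial(n; n_0, n_1, ...): phi_j = 1/j!."""
    n = sum(profile)
    denom = 1
    for j, nj in enumerate(profile):
        denom *= factorial(j) ** nj
    return factorial(n - 1) * multinomial(n, profile) // denom

if __name__ == "__main__":
    n = 6
    tally = Counter(degree_profile(t, n) for t in plane_trees(n))
    for profile, count in sorted(tally.items()):
        print(profile, count, catalan_profile_count(profile))
    print(sum(cayley_profile_count(pr) for pr in tally), n ** (n - 1))
```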
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Level profile of trees. The next example demonstrates the utility of complete GFs 
for investigating the level profile of trees. 


Example III.20. Trees and level profile. Given a rooted tree Tt, its level profile is defined as the 
vector (ng, 71,"2,...) where nj; is the number of nodes present at level j (i.e., at distance j 
from the root) in tree t. Continuing within the framework of a simple variety of trees, we now 
define the quantity ¥y:n9,2,,... to be the number of trees with size n and level profile given by 
the n;. The corresponding complete GF Y(z, u) with z marking size and u ; marking nodes at 
level j is expressible in terms of the fundamental “generator” ¢: 


(52) Y(z, u) = zug (cud (curd (cu3z(---)))). 


We may call this a “continued $\phi$-form”. For instance, general Catalan trees have generator
$\phi(w) = (1 - w)^{-1}$, so that in this case the complete GF is the continued fraction:

$$Y(z,\mathbf{u}) = \cfrac{u_0 z}{1 - \cfrac{u_1 z}{1 - \cfrac{u_2 z}{1 - \cfrac{u_3 z}{\ddots}}}}. \tag{53}$$

(See Section V.4, p. 318, for complementary aspects.) In contrast, Cayley trees are generated
by $\phi(w) = e^w$, so that

$$Y(z, \mathbf{u}) = z u_0\, e^{z u_1 e^{z u_2 e^{z u_3 e^{\cdot^{\cdot^{\cdot}}}}}},$$

which is a “continued exponential”; that is, a tower of exponentials. Expanding such generating
functions with respect to $u_0, u_1, \ldots$, in order gives the following proposition straightforwardly.


Proposition III.8 (Level profile of trees). The number of trees of size $n$, having $(n_0, n_1, n_2, \ldots)$
as level profile, in a simple variety of trees with generator $\phi(w)$ is

$$Y_{n;n_0,n_1,n_2,\ldots} = \omega_{n-1} \cdot \phi^{(n_0)}_{n_1}\,\phi^{(n_1)}_{n_2}\,\phi^{(n_2)}_{n_3} \cdots \quad\text{where}\quad \phi^{(\mu)}_{\nu} := [w^{\nu}]\,\phi(w)^{\mu}.$$

There, the consistency conditions are $n_0 = 1$ and $\sum_j n_j = n$. In particular, the counts for
general Catalan trees and for Cayley trees are, respectively,

$$\binom{n_0+n_1-1}{n_1}\binom{n_1+n_2-1}{n_2}\binom{n_2+n_3-1}{n_3}\cdots, \qquad \frac{(n-1)!}{n_0!\,n_1!\,n_2!\cdots}\; n_0^{n_1} n_1^{n_2} n_2^{n_3} \cdots.$$

(Note that one must always have $n_0 = 1$ for a single tree; the general formula with $n_0 \neq 1$
and $\omega_{n-1}$ replaced by $\omega_{n-n_0}$ gives the level profile of forests.) The first of these enumerative
results is due to Flajolet [214] and it places itself within a general combinatorial theory of
continued fractions (Section V.4, p. 318); the second one is due to Rényi and Szekeres [507],
who developed such a formula in the course of a deep study relative to the distribution of height
in random Cayley trees (Chapter VII, p. 537). □
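The Catalan case of Proposition III.8 lends itself to a quick numerical check: summing the product of binomial coefficients over all admissible level profiles of size $n$ should recover the total number of general Catalan trees of size $n$, which is the Catalan number $C_{n-1}$. The sketch below does exactly this; the helper names are hypothetical, and only the unlabelled Catalan formula is exercised.

```python
from math import comb

def compositions(m):
    """Yield all tuples of positive integers summing to m (the empty tuple for m = 0)."""
    if m == 0:
        yield ()
        return
    for first in range(1, m + 1):
        for rest in compositions(m - first):
            yield (first,) + rest

def catalan_level_count(profile):
    """Product of binomials C(n_j + n_{j+1} - 1, n_{j+1}) over consecutive levels."""
    count = 1
    for here, below in zip(profile, profile[1:]):
        count *= comb(here + below - 1, below)
    return count

def total_trees(n):
    """Sum the level-profile counts over all profiles (1, n_1, n_2, ...) of size n."""
    return sum(catalan_level_count((1,) + tail) for tail in compositions(n - 1))
```

For instance, the profile $(1, 2, 1)$ gives $\binom{2}{2}\binom{2}{1} = 2$: the root has two children, and either one of them carries the single node of level 2.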


▷ III.29. Continued forms for path length. The BGF of path length is obtained from the level
profile MGF by means of the substitution $u_j \mapsto q^j$. For general Catalan trees and Cayley trees,
this gives

$$G(z,q) = \cfrac{z}{1 - \cfrac{zq}{1 - \cfrac{zq^2}{1 - \ddots}}}, \qquad T(z,q) = z\,e^{zq\,e^{zq^2 e^{\cdot^{\cdot^{\cdot}}}}}, \tag{54}$$

where $q$ marks path length. The MGFs are ordinary and exponential. (Combined with differentiation,
such MGFs represent an attractive option for mean value analysis.) ◁


Trees and processes. The next example is an especially important application of 
complete GFs, as these GFs provide a bridge between combinatorial models and a 
major class of stochastic processes, the branching processes of probability theory. 


Example III.21. Weighted tree models and branching processes. Consider the family $\mathcal{G}$ of all
general plane trees. Let $\Lambda = (\lambda_0, \lambda_1, \ldots)$ be a system of numeric weights. The weight of a
node of outdegree $j$ is taken to be $\lambda_j$ and the weight of a tree is the product of the individual
weights of its nodes:

$$\pi(\tau) = \prod_{j=0}^{\infty} \lambda_j^{\chi_j(\tau)}, \tag{55}$$

with $\chi_j(\tau)$ the number of nodes of degree $j$ in $\tau$. One can view the weighted model of trees as
a model in which a tree receives a probability proportional to $\pi(\tau)$. Precisely, the probability
of selecting a particular tree $\tau$ under this model is, for a fixed size $n$,

$$\mathbb{P}_{\mathcal{G}_n,\Lambda}(\tau) = \frac{\pi(\tau)}{\sum_{|T|=n} \pi(T)}. \tag{56}$$

This defines a probability measure over the set $\mathcal{G}_n$, and one can consider events and random
variables under this weighted model.

The weighted model defined by (55) and (56) covers any simple variety of trees: just
replace each $\lambda_j$ by the quantity $\phi_j$ given by the “generator” (50) of the model. For instance,
plane unlabelled unary–binary trees are obtained by $\Lambda = (1, 1, 1, 0, 0, \ldots)$, while Cayley trees
correspond to $\lambda_j = 1/j!$. Two equivalence-preserving transformations are then especially
important in this context:

  (i) Let $\Lambda^*$ be defined by $\lambda^*_j = c\lambda_j$ for some non-zero constant $c$. Then the weight
      corresponding to $\Lambda^*$ satisfies $\pi^*(\tau) = c^{|\tau|}\pi(\tau)$. Consequently, the models associated
      to $\Lambda$ and $\Lambda^*$ are equivalent as regards (56).

  (ii) Let $\Lambda^\circ$ be defined by $\lambda^\circ_j = \theta^j\lambda_j$ for some non-zero constant $\theta$. Then the weight
      corresponding to $\Lambda^\circ$ satisfies $\pi^\circ(\tau) = \theta^{|\tau|-1}\pi(\tau)$, since $\sum_j j\chi_j(\tau) = |\tau| - 1$ for
      any tree $\tau$. Thus the models $\Lambda^\circ$ and $\Lambda$ are again equivalent.


Each transformation has a simple effect on the generator $\phi$, namely:

$$\phi(w) \mapsto \phi^*(w) = c\,\phi(w) \qquad\text{and}\qquad \phi(w) \mapsto \phi^\circ(w) = \phi(\theta w). \tag{57}$$

Once equipped with such equivalence transformations, it becomes possible to describe
probabilistically the process that generates trees according to a weighted model. Assume that
$\lambda_j \ge 0$ and that the $\lambda_j$ are summable. Then the normalized quantities

$$p_j = \frac{\lambda_j}{\sum_j \lambda_j}$$

form a probability distribution over $\mathbb{N}$. By the first equivalence-preserving transformation the
model induced by the weights $p_j$ is the same as the original model induced by the $\lambda_j$. (By
the second equivalence transformation, one can furthermore assume that the generator $\phi$ is the
probability generating function of the $p_j$.)

Such a model defined by non-negative weights $\{p_j\}$ summing to 1 is nothing but the classical
model of branching processes (also known as Galton–Watson processes); see [21, 324]. In
effect, a realization $T$ of the branching process is classically defined by the two rules: (i) produce
a root node of degree $j$ with probability $p_j$; (ii) if $j \ge 1$, attach to the root node a
collection $T_1, \ldots, T_j$ of independent realizations of the process. This may be viewed as the
development of a “family” stemming from a common ancestor where any individual has probability
$p_j$ of giving birth to $j$ children. Clearly, the probability of obtaining a particular finite
tree $\tau$ is $\pi(\tau)$, where $\pi$ is given by (55) and the weights are $\lambda_j = p_j$. The
generator

$$\phi(w) = \sum_{j=0}^{\infty} p_j w^j$$

is then nothing but the probability generating function of (one-generation) offspring, with the
quantity $\mu = \phi'(1)$ being its mean size.
For the record, we recall that branching processes can be classified into three categories
depending on the values of $\mu$.

Subcriticality: when $\mu < 1$, the random tree produced is finite with probability 1
and its expected size is also finite.

Criticality: when $\mu = 1$, the random tree produced is finite with probability 1 but its
expected size is infinite.

Supercriticality: when $\mu > 1$, the random tree produced is finite with probability
strictly less than 1.


From the discussion of equivalence transformations (57), it is furthermore true that, regarding
trees of a fixed size $n$, there is complete equivalence between all branching processes with
generators of the form

$$\phi_\theta(w) = \frac{\phi(\theta w)}{\phi(\theta)}.$$

Such families of related functions are known as “exponential families” in probability theory. In
this way, one may always regard at will the random tree produced by a weighted model of some
fixed size $n$ as originating from a branching process (of subcritical, critical, or supercritical
type) conditioned upon the size of the total progeny.

Finally, take a set $\mathcal{S} \subset \mathcal{G}$ for which the complete generating function of $\mathcal{S}$ with respect to
the degree profile is available,

$$S(z, u_0, u_1, \ldots) = \sum_{\tau\in\mathcal{S}} z^{|\tau|}\, u_0^{\chi_0(\tau)} u_1^{\chi_1(\tau)} \cdots.$$

Then, for a system of weights $\Lambda$, one has

$$S(z, \lambda_0, \lambda_1, \ldots) = \sum_{\tau\in\mathcal{S}} \pi(\tau)\, z^{|\tau|}.$$

Thus, we can find the probability that a weighted tree of size $n$ belongs to $\mathcal{S}$, by extracting
the coefficient of $z^n$. This applies a fortiori to branching processes as well. In summary, the
analysis of parameters of trees of size $n$ under either weighted models or branching process
models follows from substituting weights or probability values in the corresponding complete
generating functions. □

The reduction of combinatorial tree models to branching processes was pursued
early, most notably by the “Russian School”: see especially the books by Kolchin
[386, 387] and references therein. (For asymptotic purposes, the equivalence between
combinatorial models and critical branching processes often turns out to be most fruitful.)
Conversely, symbolic-combinatorial methods may be viewed as a systematic way
of obtaining equations relative to characteristics of branching processes. We do not
elaborate further along these lines as this would take us outside of the scope of the
present book.
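The equivalence transformations of Example III.21 can be observed directly on small trees. The sketch below computes the distribution (56) over plane trees of a fixed size with exact rational arithmetic, and lets one compare a weight system with its rescaled version $\lambda_j \mapsto c\,\theta^j \lambda_j$, which combines the two transformations of (57). The tree encoding (nested tuples) and the names `model`, `rescaled`, `catalan`, `geometric`, `cayley` are choices made for this illustration.

```python
from fractions import Fraction
from math import factorial

def forests(m):
    """Yield every ordered forest of plane trees with m nodes (a tree is a tuple of subtrees)."""
    if m == 0:
        yield ()
        return
    for first_size in range(1, m + 1):
        for first in forests(first_size - 1):
            for rest in forests(m - first_size):
                yield (first,) + rest

def plane_trees(n):
    """Yield every plane tree with n nodes."""
    yield from forests(n - 1)

def weight(tree, lam):
    """pi(tree) as in (55): the product of lam(outdegree) over all nodes."""
    w = lam(len(tree))
    for child in tree:
        w *= weight(child, lam)
    return w

def model(n, lam):
    """Probability (56) of each tree of size n under the weight system lam."""
    weights = {t: weight(t, lam) for t in plane_trees(n)}
    total = sum(weights.values())
    return {t: w / total for t, w in weights.items()}

def rescaled(lam, c, theta):
    """Both transformations of (57) at once: lambda_j -> c * theta**j * lambda_j."""
    return lambda j: c * theta ** j * lam(j)

def catalan(j):
    return Fraction(1)

def geometric(j):
    # Offspring law p_j = 1 / 2^(j+1), i.e. c = theta = 1/2 applied to the Catalan weights.
    return Fraction(1, 2 ** (j + 1))

def cayley(j):
    return Fraction(1, factorial(j))
```

With these definitions, `model(5, catalan)` and `model(5, geometric)` are the same uniform distribution over the 14 plane trees of size 5, and every such tree has probability $2^{-9}$ of being produced by the unconditioned branching process with geometric offspring.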
▷ III.30. Catalan trees, Cayley trees, and branching processes. Catalan trees of size $n$ are
defined by the weighted model in which $\lambda_j = 1$, but also equivalently by $\widehat{\lambda}_j = c\theta^j$, for
any $c > 0$ and $\theta < 1$. In particular they coincide with the random tree produced by the critical
branching process whose offspring probabilities are geometric: $p_j = 1/2^{j+1}$.

Cayley trees are a priori defined by $\lambda_j = 1/j!$. They can be generated by the critical
branching process with Poisson probabilities, $p_j = e^{-1}/j!$, and more generally with an arbitrary
Poisson distribution $p_j = e^{-\lambda}\lambda^j/j!$. ◁


III. 7. Additional constructions


We discuss here additional constructions already examined in earlier chapters;
namely pointing and substitution (Section III. 7.1), order constraints (Section III. 7.2),
and implicit structures (Section III. 7.3). Given that basic translation mechanisms can
be directly adapted to the multivariate realm, such extensions involve basically no
new concept, and the methods of Chapters I and II can be easily recycled. In
Section III. 7.4, we revisit the classical principle of inclusion–exclusion under a generating
function perspective. In this light, the principle appears as a typically multivariate
device well suited to enumerating objects according to the number of occurrences of
subconfigurations.


III. 7.1. Pointing and substitution. Let $(\mathcal{F}, \chi)$ be a class–parameter pair, where
$\chi$ is multivariate of dimension $r \ge 1$, and let $F(\mathbf{z})$ be the MGF associated to it in
the notations of (19) and (28). In particular $z_0 = z$ marks size, and $z_k$ marks the
component $k$ of the multiparameter $\chi$. If $z$ marks size, then, as in the univariate
case, $\Theta_z = z\partial_z$ translates the fact of distinguishing one atom. Generally, pick up a
variable $x = z_j$ for some $j$ with $0 \le j \le r$. Then since

$$x\partial_x\,(s^a t^b x^f) = f \cdot (s^a t^b x^f),$$

the interpretation of the operator $\Theta_x = x\partial_x$ is immediate; it means “pick up in all
possible ways in objects of $\mathcal{F}$ a configuration marked by $x$ and point to it”. For
instance, if $F(z,u)$ is the BGF of trees where $z$ marks size and $u$ marks leaves,
then $\Theta_u F(z, u) = u\partial_u F(z, u)$ enumerates trees with one distinguished leaf.

Similarly, the substitution $x \mapsto S(\mathbf{z})$ in a GF $F$, where $S(\mathbf{z})$ is the MGF of a
class $\mathcal{S}$, means attaching an object of type $\mathcal{S}$ to configurations marked by the variable
$x$ in $\mathcal{F}$. The process is better understood by practice than by long formal developments.
Justification in each particular case can be easily obtained by returning to
the combinatorial representation of generating functions as images of combinatorial
classes.

Figure III.16. The technique of “adding a slice” for constrained compositions.


Example III.22. Constrained integer compositions and “slicing”. This example illustrates
variations around the substitution scheme. Consider compositions of integers where successive
summands have sizes that are constrained to belong to a fixed set $\mathcal{R} \subset \mathbb{N}^2$. For instance, the
relations

$$\mathcal{R}_1 = \{(x, y) \mid 1 \le x \le y\}, \qquad \mathcal{R}_2 = \{(x,y) \mid 1 \le y \le 2x\},$$

correspond to weakly increasing summands in the case of $\mathcal{R}_1$ and to summands that can at most
double at each stage in the case of $\mathcal{R}_2$. In the “ragged landscape” representation of compositions,
this means considering diagrams of unit cells aligned in columns along the horizontal
axis, with successive columns obeying the constraint imposed by $\mathcal{R}$.

Let $F(z, u)$ be the BGF of such $\mathcal{R}$-restricted compositions, where $z$ marks total sum and $u$
marks the value of the last summand; that is, the height of the last column. The function $F(z, u)$
satisfies a functional equation of the form

$$F(z,u) = f(zu) + \bigl(\mathcal{L}[F(z, u)]\bigr)_{u \mapsto zu}, \tag{58}$$

where $f(z)$ is the generating function of the one-column objects and $\mathcal{L}$ is a linear operator over
formal series in $u$ given by

$$\mathcal{L}[u^j] := \sum_{(j,k)\in\mathcal{R}} u^k. \tag{59}$$

In effect, Equation (58) describes inductively objects as comprising either one column ($f(zu)$)
or else as being formed by adding a new column to an existing one; see Figure III.16. The
process of appending a slice of size $k$ to an object whose last slice has size $j$, with $(j,k) \in \mathcal{R}$,
is precisely what (59) expresses; the functional equation (58) is obtained by effecting the final
substitution $u \mapsto zu$, in order to take into account the $k$ atoms contributed by the new slice.
The special case $F(z, 1)$ gives the enumeration of $\mathcal{F}$-objects irrespective of the size of the last column.

For a rule $\mathcal{R}$ that is “simple”, the basic equation (58) will often involve a substitution. Let
us first rederive in this way the enumeration of partitions. We take $\mathcal{R} = \mathcal{R}_1$ and assume that
the first column can have any positive size. Compositions into increasing summands are clearly
the same as partitions. Since

$$\mathcal{L}[u^j] = u^j + u^{j+1} + u^{j+2} + \cdots = \frac{u^j}{1-u},$$

the function $F(z, u)$ satisfies a functional equation involving a substitution,

$$F(z,u) = \frac{zu}{1 - zu} + \frac{1}{1 - zu}\,F(z, zu). \tag{60}$$

This relation iterates: any linear functional equation of the substitution type

$$\phi(u) = \alpha(u) + \beta(u)\,\phi(\sigma(u))$$

is solved formally by

$$\phi(u) = \alpha(u) + \beta(u)\,\alpha(\sigma(u)) + \beta(u)\,\beta(\sigma(u))\,\alpha(\sigma^{\langle 2\rangle}(u)) + \cdots, \tag{61}$$

where $\sigma^{\langle j\rangle}(u)$ designates the $j$th iterate of $u$ under $\sigma$.
We can now return to partitions. The turnkey solution (61) gives, upon iterating on the
second argument and treating the first argument as a parameter,

$$F(z,u) = \frac{zu}{1 - zu} + \frac{z^2 u}{(1 - zu)(1 - z^2u)} + \frac{z^3 u}{(1 - zu)(1 - z^2u)(1 - z^3u)} + \cdots. \tag{62}$$

Equivalence with the alternative form

$$F(z,u) = \frac{zu}{1-z} + \frac{z^2u^2}{(1-z)(1-z^2)} + \frac{z^3u^3}{(1-z)(1-z^2)(1-z^3)} + \cdots \tag{63}$$

is then easily verified from (60) by expanding $F(z, u)$ as a series in $u$ and applying the method of
indeterminate coefficients to the form $(1-zu)F(z, u) = zu + F(z, zu)$. (The representation (63)
is furthermore consistent with the treatment of partitions given in Chapter I since the quantity
$[u^k]F(z, u)$ clearly represents the OGF of non-empty partitions whose largest summand is $k$. In
passing, the equality between (62) and (63) is a shallow but curious identity that is quite typical
of the area of $q$-analogues.)


This same method has been applied in [250] to compositions satisfying condition $\mathcal{R}_2$
above. In this case, successive summands are allowed to double at most at each stage. The
associated linear operator is

$$\mathcal{L}[u^j] = u + \cdots + u^{2j} = u\,\frac{1 - u^{2j}}{1 - u}.$$

For simplicity, it is assumed that the first column has size 1. Thus, $F$ satisfies a functional
equation of the substitution type:

$$F(z, u) = zu + \frac{zu}{1 - zu}\bigl(F(z, 1) - F(z, z^2u^2)\bigr).$$

This can be solved by means of the general iteration mechanism (61), treating for the moment
$F(z, 1)$ as a known quantity: with $a(u) := zu + \frac{zu}{1 - zu}F(z, 1)$, one has

$$F(z, u) = a(u) - \frac{zu}{1 - zu}\,a(z^2u^2) + \frac{zu}{1 - zu}\,\frac{z^3u^2}{1 - z^3u^2}\,a(z^6u^4) - \cdots.$$

Then, the substitution $u = 1$ in the solution becomes permissible. Upon solving for $F(z, 1)$,
one eventually gets the somewhat curious GF for compositions satisfying $\mathcal{R}_2$:

$$F(z,1) = \frac{\sum_{j\ge 1} (-1)^{j-1}\, z^{2^{j+1}-j-2} / Q_{j-1}(z)}{\sum_{j\ge 0} (-1)^{j}\, z^{2^{j+1}-j-2} / Q_{j}(z)}, \quad\text{where}\quad Q_j(z) = (1-z)(1-z^3)(1-z^7)\cdots(1-z^{2^j-1}).$$

The sequence of coefficients starts as 1, 1, 2, 3, 5, 9, 16, 28, 50 and is EIS A002572: it represents,
for instance, the number of possible level profiles of binary trees, or equivalently the
number of partitions of 1 into summands of the form $1, \frac12, \frac14, \frac18, \ldots$ (this is related to the number
of solutions to Kraft’s inequality). See [250] for details, including precise asymptotic estimates,
and Tangora’s paper [571] for relations to algebraic topology. □

The reason for presenting the slicing method³ in some detail is that it is very
general. It has been particularly employed to derive a number of original enumerations
of polyominoes by area, a topic of interest in some branches of statistical mechanics:
for instance, the book by Janse van Rensburg [592] discusses many applications of
such lattice models to polymers and vesicles. Bousquet-Mélou’s review paper [82]
offers a methodological perspective. Some of the origins of the method point to Pólya
in the 1930s, see [490], and independently to Temperley [574, pp. 65–67].
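The “adding a slice” recursion of Example III.22 translates almost literally into a dynamic programme over the pair (total sum, last summand), which is what the coefficient $[z^n u^k]F(z,u)$ records. The sketch below is one way to write it; the function names and the encoding of the relation $\mathcal{R}$ as a Python predicate are assumptions of this illustration, not the formulation of [250].

```python
def count_by_last(n_max, allowed, first_ok):
    """table[s][k] = number of R-restricted compositions of s whose last summand is k.

    allowed(j, k) tells whether a summand k may follow a summand j, i.e. (j, k) in R;
    first_ok(k) tells whether k may be the first summand.
    """
    table = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    for s in range(1, n_max + 1):
        for k in range(1, s + 1):
            if k == s:
                # One-column object: the term f(zu) of equation (58).
                table[s][k] = 1 if first_ok(k) else 0
            else:
                # Append a slice of height k to an object of sum s - k ending in j.
                table[s][k] = sum(table[s - k][j]
                                  for j in range(1, s - k + 1) if allowed(j, k))
    return table

def totals(n_max, allowed, first_ok):
    """Coefficients [z^n] F(z, 1) for n = 1 .. n_max."""
    table = count_by_last(n_max, allowed, first_ok)
    return [sum(table[s]) for s in range(1, n_max + 1)]
```

Taking `allowed = lambda j, k: j <= k` with an unconstrained first column gives the partition numbers 1, 2, 3, 5, 7, 11, ..., while `allowed = lambda j, k: k <= 2 * j` with first column 1 reproduces the sequence 1, 1, 2, 3, 5, 9, 16, 28, 50 quoted above.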


▷ III.31. Pointing–erasing and the combinatorics of Taylor’s formula. The derivative operator
$\partial_x$ corresponds combinatorially to a “pointing–erasing” operation: select in all possible
ways an atom marked by $x$ and make it transparent to $x$-marking (e.g., by replacing it by a
neutral object). The operator $\frac{1}{k!}\partial_x^k$ then corresponds to picking up in all possible ways a
subset (order does not count) of $k$ configurations marked by $x$. The identity (Taylor’s formula)

$$f(x+y) = \sum_{k\ge 0}\left(\frac{1}{k!}\,\partial_x^k f(x)\right) y^k$$

can then receive a simple combinatorial interpretation: Given a population of individuals ($\mathcal{F}$
enumerated by $f$), form the bicoloured population of individuals enumerated by $f(x + y)$,
where each atom of each object can be repainted either in $x$-colour or $y$-colour; the process is
equivalent to deciding a priori for each individual to repaint $k$ of its atoms from $x$ to $y$, this for
all possible values of $k \ge 0$. Conclusion: seen from combinatorics, Taylor’s formula merely
expresses the logical equivalence between two ways of counting. ◁
▷ III.32. Carlitz compositions I. Let $\mathcal{K}$ be the class of compositions such that all pairs of
adjacent summands are formed of distinct values. These can be generated by the operator
$\mathcal{L}[u^j] = \frac{uz}{1-uz} - u^jz^j$, so that $\mathcal{L}[f(u)] = \frac{uz}{1-uz}f(1) - f(uz)$. The BGF $K(z, u)$, with $u$
marking the value of the last summand, then satisfies a functional equation,

$$K(z, u) = \frac{uz}{1-uz} + \frac{uz}{1-uz}\,K(z, 1) - K(z, zu),$$

giving eventually $K(z) = K(z, 1)$ under the form

$$K(z) = \left(1 + \sum_{j\ge 1}\frac{(-z)^j}{1 - z^j}\right)^{-1} = 1 + z + z^2 + 3z^3 + 4z^4 + 7z^5 + 14z^6 + 23z^7 + 39z^8 + \cdots. \tag{65}$$

The sequence of coefficients constitutes EIS A003242. Such compositions were introduced by
Carlitz in 1976; the derivation above is from a paper by Knopfmacher and Prodinger [369]
who provide early references and asymptotic properties. (We resume this thread in Note III.35,
p. 206, then in Chapter IV, p. 263, with regard to asymptotics.) ◁


III. 7.2. Order constraints. We refer in this subsection to the discussion of order
constraints in labelled products that has been given in Subsection II. 6.3 (p. 139).
We recall that the modified labelled product

$$\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C})$$


only includes the elements of $(\mathcal{B} \star \mathcal{C})$ such that the minimal label lies in the $\mathcal{B}$ component.
Once more the univariate rules generalize verbatim for parameters that are
inherited and the corresponding exponential MGFs are related by

$$A(z, \mathbf{u}) = \int_0^z \bigl(\partial_t B(t, \mathbf{u})\bigr) \cdot C(t, \mathbf{u})\, dt.$$

To illustrate this multivariate extension, we shall consider a quadrivariate statistic on
permutations.

³For other applications, see Examples V.20, p. 365 (horizontally convex polyominoes) and IX.14,
p. 660 (parallelogram polyominoes), as well as Subsection VII. 8.1, p. 506 (walks and the kernel method).


peak:         $\sigma_{i-1} < \sigma_i > \sigma_{i+1}$  |  leaf node ($u_0$)
double rise:  $\sigma_{i-1} < \sigma_i < \sigma_{i+1}$  |  unary right-branching ($u_1$)
double fall:  $\sigma_{i-1} > \sigma_i > \sigma_{i+1}$  |  unary left-branching ($u_1'$)
valley:       $\sigma_{i-1} > \sigma_i < \sigma_{i+1}$  |  binary node ($u_2$)

Figure III.17. Local order patterns in a permutation and the four types of nodes in
the corresponding increasing binary tree.

Example III.23. Local order patterns in permutations. An element $\sigma_i$ of a permutation
written $\sigma = \sigma_1, \ldots, \sigma_n$ when compared to its immediate neighbours can be categorized into
one of four types⁴ summarized in the first two columns of Figure III.17. The correspondence
with binary increasing trees described in Example II.17 and Figure II.16 (p. 143) then shows the
following: peaks and valleys correspond to leaves and binary nodes, respectively, while double
rises and double falls are associated with right-branching and left-branching unary nodes. Consider
the class $\widehat{\mathcal{I}}$ of non-empty increasing binary trees (so that $\widehat{\mathcal{I}} = \mathcal{I} \setminus \{\epsilon\}$ in the notations of
p. 143) and let $u_0, u_1, u_1', u_2$ be markers for the number of nodes of each type, as summarized
in Figure III.17. Then the exponential MGF of non-empty increasing trees under this statistic is
given by


$$\widehat{\mathcal{I}} = u_0\,\mathcal{Z} + u_1\,(\mathcal{Z}^{\square} \star \widehat{\mathcal{I}}) + u_1'\,(\widehat{\mathcal{I}} \star \mathcal{Z}^{\square}) + u_2\,(\widehat{\mathcal{I}} \star \mathcal{Z}^{\square} \star \widehat{\mathcal{I}})$$

$$\Longrightarrow\quad \widehat{I}(z) = u_0 z + \int_0^z \Bigl((u_1 + u_1')\,\widehat{I}(w) + u_2\,\widehat{I}(w)^2\Bigr)\,dw,$$

which gives rise to the differential equation:

$$\frac{\partial}{\partial z}\widehat{I}(z, \mathbf{u}) = u_0 + (u_1 + u_1')\,\widehat{I}(z, \mathbf{u}) + u_2\,\widehat{I}(z, \mathbf{u})^2.$$

This is solved by separation of variables as

$$\widehat{I}(z, \mathbf{u}) = \frac{\delta}{u_2}\,\frac{v_1 + \delta\tan(z\delta)}{\delta - v_1\tan(z\delta)} - \frac{v_1}{u_2}, \tag{66}$$

where the following abbreviations are used:

$$v_1 = \tfrac{1}{2}(u_1 + u_1'), \qquad \delta = \sqrt{u_0 u_2 - v_1^2}.$$

One finds

$$\widehat{I} = u_0 z + u_0(u_1 + u_1')\,\frac{z^2}{2!} + u_0\bigl((u_1 + u_1')^2 + 2u_0u_2\bigr)\frac{z^3}{3!} + \cdots,$$

which agrees with the small cases. This calculation is consistent with what has been found in
Chapter II regarding the EGF of all non-empty permutations and of alternating permutations,

$$\frac{z}{1-z}, \qquad \tan(z),$$

that follow from the substitutions $\{u_0 = u_1 = u_1' = u_2 = 1\}$ and $\{u_0 = u_2 = 1,\ u_1 = u_1' = 0\}$,
respectively. The substitution $\{u_0 = u_1 = u,\ u_1' = u_2 = 1\}$ gives a simple variant (without the
empty permutation) of the BGF of Eulerian numbers (75) on p. 209.

From the quadrivariate GF, there results that, in a tree of size $n$ the mean number of nodes
of nullary, unary, or binary type is asymptotic to $n/3$, with a variance that is $O(n)$, thereby
ensuring concentration of distribution. □

⁴Here, for $|\sigma| = n$, we regard $\sigma$ as bordered by $(-\infty, -\infty)$, i.e., we set $\sigma_0 = \sigma_{n+1} = -\infty$ and let
the index $i$ in Figure III.17 vary in $[1\,..\,n]$. Alternative bordering conventions prove occasionally useful.

Figure III.18. The level profile of a random increasing binary tree of size 256.
(Compare with Figure III.15, p. 186, for binary trees drawn under the uniform Catalan
statistics.)
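The first terms of the expansion of (66) can be cross-checked without any symbolic algebra. The sketch below derives the polynomials $T_n = n!\,[z^n]\widehat{I}$ from the differential equation through the recurrence $T_{n+1} = (u_1 + u_1')\,T_n + u_2 \sum_k \binom{n}{k} T_k T_{n-k}$ (with $T_0 = 0$, $T_1 = u_0$), and compares them with a direct tally of peaks, double rises, double falls and valleys over all permutations, using the bordering convention of the footnote. Representing a polynomial as a `Counter` keyed by exponent 4-tuples is an implementation choice of this illustration.

```python
from collections import Counter
from itertools import permutations
from math import comb

# A polynomial in (u0, u1, u1', u2) is a Counter: exponent 4-tuple -> integer coefficient.
U0 = Counter({(1, 0, 0, 0): 1})
U2 = Counter({(0, 0, 0, 1): 1})
U1_PLUS_U1P = Counter({(0, 1, 0, 0): 1, (0, 0, 1, 0): 1})

def poly_mul(p, q):
    """Product of two polynomials stored as Counters."""
    result = Counter()
    for ep, cp in p.items():
        for eq, cq in q.items():
            exponent = tuple(a + b for a, b in zip(ep, eq))
            result[exponent] += cp * cq
    return result

def tree_polys(n_max):
    """T[n] = n! [z^n] of the MGF, from T' = u0 + (u1 + u1') T + u2 T^2 with T(0) = 0."""
    T = [Counter(), Counter(U0)]
    for n in range(1, n_max):
        nxt = poly_mul(U1_PLUS_U1P, T[n])
        for k in range(n + 1):
            square_term = poly_mul(U2, poly_mul(T[k], T[n - k]))
            for exponent, c in square_term.items():
                nxt[exponent] += comb(n, k) * c
        T.append(nxt)
    return T

def pattern_tally(n):
    """Tally permutations of size n by (peaks, double rises, double falls, valleys)."""
    tally = Counter()
    for perm in permutations(range(1, n + 1)):
        padded = (0,) + perm + (0,)          # 0 plays the role of -infinity
        e = [0, 0, 0, 0]
        for i in range(1, n + 1):
            rises_in = padded[i - 1] < padded[i]
            falls_out = padded[i] > padded[i + 1]
            if rises_in and falls_out:
                e[0] += 1                    # peak, marked by u0
            elif rises_in:
                e[1] += 1                    # double rise, marked by u1
            elif falls_out:
                e[2] += 1                    # double fall, marked by u1'
            else:
                e[3] += 1                    # valley, marked by u2
        tally[tuple(e)] += 1
    return tally
```

For $n = 3$ both computations give $u_0u_1^2 + 2u_0u_1u_1' + u_0u_1'^2 + 2u_0^2u_2$, which is the coefficient $u_0\bigl((u_1+u_1')^2 + 2u_0u_2\bigr)$ displayed above.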


A similar analysis yields path length. It is found that a random increasing binary
tree of size $n$ has mean path length

$$2n\log n + O(n).$$

Contrary to what the uniform combinatorial model gives, such trees tend to be rather
well balanced, and a typical branch is only about 38.6% longer than in a perfect binary
tree (since $2\log 2 \doteq 1.386$): see Figure III.18 for an illustration. This fact applies
to binary search trees (Note III.33) and it justifies the fact that the performance of
such trees is quite good, when they are applied to random data [378, 429, 538] or
subjected to randomization [451, 520]. See Subsection VI. 10.3 (p. 427) dedicated
to tree recurrences for a general analysis of additive functionals on such trees and
Example IX.28, p. 684, for a distributional analysis of depth.


▷ III.33. Binary search trees (BSTs). Given a permutation $\tau$, one defines inductively a tree
$\mathrm{BST}(\tau)$ by

$$\mathrm{BST}(\epsilon) = \emptyset; \qquad \mathrm{BST}(\tau) = \bigl\langle \tau_1,\ \mathrm{BST}(\tau|_{<\tau_1}),\ \mathrm{BST}(\tau|_{>\tau_1})\bigr\rangle.$$

(Here, $\tau|_P$ represents the subword of $\tau$ consisting of those elements that satisfy predicate $P$.)
Let $\mathrm{IBT}(\sigma)$ be the increasing binary tree canonically associated to $\sigma$. Then one has the
fundamental Equivalence Principle,

$$\mathrm{IBT}(\sigma) \overset{\text{shape}}{\equiv} \mathrm{BST}(\sigma^{-1}),$$

where $A \overset{\text{shape}}{\equiv} B$ means that $A$ and $B$ have identical tree shapes. (Hint: relate the trees to the
cartesian representation of permutations [538, 600], as in Example II.17, p. 143.) ◁
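The Equivalence Principle of Note III.33 is easy to test exhaustively on small permutations. In the sketch below a tree shape is either `None` (empty tree) or a pair (left shape, right shape); permutations are tuples of the values $1, \ldots, n$. These encodings and the function names are assumptions of this illustration.

```python
from itertools import permutations

def bst_shape(word):
    """Shape of BST(word): the first letter is the root, smaller letters go left, larger go right."""
    if not word:
        return None
    root = word[0]
    smaller = tuple(x for x in word if x < root)
    larger = tuple(x for x in word if x > root)
    return (bst_shape(smaller), bst_shape(larger))

def ibt_shape(word):
    """Shape of the increasing binary tree: split the word at the position of its minimum."""
    if not word:
        return None
    i = word.index(min(word))
    return (ibt_shape(word[:i]), ibt_shape(word[i + 1:]))

def inverse(perm):
    """Inverse of a permutation of 1..n given as the tuple of its values."""
    inv = [0] * len(perm)
    for position, value in enumerate(perm, start=1):
        inv[value - 1] = position
    return tuple(inv)

def check_equivalence(n):
    """True when IBT(sigma) and BST(sigma^-1) have the same shape for every sigma of size n."""
    return all(ibt_shape(p) == bst_shape(inverse(p))
               for p in permutations(range(1, n + 1)))
```

As a small instance, $\sigma = (2, 3, 1)$ has inverse $(3, 1, 2)$, and both trees consist of a root whose left child has a single right child.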


III. 7.3. Implicit structures. For implicit structures defined by a relation of the
form $\mathcal{A} = \mathfrak{K}[\mathcal{X}]$, we note that equations involving sums and products, either labelled
or not, are easily solved just as in the univariate case. The same remark applies for sequence
and set constructions: refer to the corresponding sections of Chapters I (p. 88)
and II (p. 137). Again, the process is best understood by examples.

Suppose for instance one wants to enumerate connected labelled graphs by the 
number of nodes (marked by z) and the number of edges (marked by u). The class K 
of connected graphs and the class G of all graphs are related by the set construction, 


G = SET(K), 


meaning that every graph decomposes uniquely into connected components. The corresponding
exponential BGFs then satisfy

$$G(z,u) = e^{K(z,u)} \quad\text{implying}\quad K(z,u) = \log G(z,u),$$

since the number of edges in a graph is inherited (additively) from the corresponding
numbers in connected components. Now, the number of graphs of size $n$ having $k$
edges is $\binom{n(n-1)/2}{k}$, so that

$$K(z,u) = \log\left(1 + \sum_{n=1}^{\infty} (1+u)^{n(n-1)/2}\,\frac{z^n}{n!}\right). \tag{67}$$

This formula, which appears as a refinement of the univariate formula of Chapter II
(p. 138), then simply reads: connected graphs are obtained as components (the log
operator) of general graphs, where a general graph is determined by the presence or
absence of an edge (corresponding to $(1+u)$) between any pair of nodes (the exponent
$n(n - 1)/2$).
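Formula (67) can be unfolded numerically without expanding a logarithm. Differentiating $G = e^K$ with respect to $z$ gives $G' = K'G$, which at the level of coefficients reads $G_n(u) = \sum_{j=1}^{n}\binom{n-1}{j-1}K_j(u)\,G_{n-j}(u)$, where $G_n(u) = (1+u)^{n(n-1)/2}$ and $K_n(u)$ is the polynomial counting connected graphs on $n$ labelled nodes by number of edges. The sketch below solves this recurrence for $K_n$; the list-of-coefficients representation and the function names are assumptions of this illustration.

```python
from math import comb

def poly_mul(p, q):
    """Product of two polynomials in u given as coefficient lists."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

def all_graphs(n):
    """Coefficients of (1 + u)^(n(n-1)/2): all graphs on n labelled nodes, by edges."""
    m = n * (n - 1) // 2
    return [comb(m, k) for k in range(m + 1)]

def connected_graphs(n_max):
    """K[n][k] = number of connected labelled graphs with n nodes and k edges."""
    K = [None, [1]]                      # a single node, no edge
    for n in range(2, n_max + 1):
        poly = all_graphs(n)
        # Remove graphs whose component containing node 1 has j < n nodes.
        for j in range(1, n):
            term = poly_mul(K[j], all_graphs(n - j))
            for k, c in enumerate(term):
                poly[k] -= comb(n - 1, j - 1) * c
        K.append(poly)
    return K
```

For $n = 4$ this yields 16 connected graphs with three edges (the Cayley trees), 15 with four, 6 with five and 1 with six, for a total of 38.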

To pull information out of the formula (67) is, however, not obvious due to the 
alternation of signs in the expansion of log(1 + w) and due to the strongly divergent 
character of the involved series. As an aside, we note here that the quantity 


$$\widehat{K}(z,u) = K\!\left(\frac{z}{u},\, u\right)$$

enumerates connected graphs according to size (marked by $z$) and excess (marked
by $u$) of the number of edges over the number of nodes. This means that the results
of Note II.23 (p. 135), obtained by Wright’s decomposition, can be rephrased as the
expansion (within $\mathbb{C}(u)[[z]]$):

$$\log\left(1 + \sum_{n=1}^{\infty}(1+u)^{n(n-1)/2}\,\frac{z^n u^{-n}}{n!}\right) = \frac{1}{u}W_{-1}(z) + W_0(z) + \cdots = \frac{1}{u}\left(T - \frac{1}{2}T^2\right) + \left(\frac{1}{2}\log\frac{1}{1-T} - \frac{1}{2}T - \frac{1}{4}T^2\right) + \cdots, \tag{68}$$

with $T \equiv T(z)$. See Temperley’s early works [573, 574] as well as the “giant paper on
the giant component” [354] and the paper [254] for direct derivations that eventually
constitute analytic alternatives to Wright’s combinatorial approach.


Example III.24. Smirnov words. Following the treatment of Goulden and Jackson [303], we
define a Smirnov word to be any word that has no consecutive equal letters. Let $\mathcal{W} = \mathrm{SEQ}(\mathcal{A})$
be the set of words over the alphabet $\mathcal{A} = \{a_1, \ldots, a_r\}$ of cardinality $r$, and $\mathcal{S}$ be the set of
Smirnov words. Let also $v_j$ mark the number of occurrences of the $j$th letter in a word. One
has⁵

$$W(v_1, \ldots, v_r) = \frac{1}{1 - (v_1 + \cdots + v_r)}.$$

Start from a Smirnov word and substitute for any letter $a_j$ that appears in it an arbitrary non-empty
sequence of letters $a_j$. When this operation is done at all places of a Smirnov word,
it gives rise to an unconstrained word. Conversely, any word can be associated to a unique
Smirnov word by collapsing into single letters maximal groups of contiguous equal letters. In
other terms, arbitrary words are derived from Smirnov words by a simultaneous substitution:

$$\mathcal{W} = \mathcal{S}\bigl[a_1 \mapsto \mathrm{SEQ}_{\ge 1}\{a_1\}, \ldots, a_r \mapsto \mathrm{SEQ}_{\ge 1}\{a_r\}\bigr].$$

This leads to the relation

$$W(v_1, \ldots, v_r) = S\left(\frac{v_1}{1 - v_1}, \ldots, \frac{v_r}{1 - v_r}\right). \tag{69}$$

This relation determines the MGF $S(v_1, \ldots, v_r)$ implicitly. Now, since the inverse function of
$v/(1 - v)$ is $v/(1 + v)$, one finds the solution:

$$S(v_1, \ldots, v_r) = W\left(\frac{v_1}{1 + v_1}, \ldots, \frac{v_r}{1 + v_r}\right) = \left(1 - \sum_{j=1}^{r}\frac{v_j}{1 + v_j}\right)^{-1}. \tag{70}$$

For instance, if we set $v_j = z$, that is, we “forget” the composition of the words into letters,
we obtain the OGF of Smirnov words counted according to length as

$$\frac{1}{1 - r\,\dfrac{z}{1+z}} = \frac{1+z}{1 - (r-1)z} = 1 + \sum_{n\ge 1} r(r-1)^{n-1}z^n.$$

This is consistent with elementary combinatorics since a Smirnov word of length $n$ is determined
by the choice of its first letter ($r$ possibilities) followed by a sequence of $n - 1$ choices
constrained to avoid one letter among $r$ (and corresponding to $r - 1$ possibilities for each
position). The interest of (70) is to apply equally well to the Bernoulli model where letters may
receive unequal probabilities and where a direct combinatorial argument does not appear to be
easy: it suffices to perform the substitution $v_j \mapsto p_j z$ in this case: see Example IV.10, p. 262
and Note V.11, p. 311, for applications to asymptotics.

From these developments, one can next build the GF of words that never contain more
than $m$ consecutive equal letters. It suffices to effect in (70) the substitution $v_j \mapsto v_j + \cdots + v_j^m$.
In particular for the univariate problem (or, equivalently, the case where letters are
equiprobable), one finds the OGF

$$\frac{1}{1 - r\,\dfrac{z\,(1 - z^m)}{1 - z^{m+1}}} = \frac{1 - z^{m+1}}{1 - rz + (r-1)z^{m+1}}.$$

This extends to an arbitrary alphabet the analysis of single runs and double runs in binary words
that was performed in Subsection I. 4.1, p. 51. Naturally, the present approach applies equally
well to non-uniform letter probabilities and to a collection of run-length upper-bounds and
lower-bounds dependent on each particular letter. This topic is in particular pursued by different
methods in several works of Karlin and coauthors (see, e.g., [446]), themselves motivated by
applications to life sciences. □
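The OGF $(1 - z^{m+1})/(1 - rz + (r-1)z^{m+1})$ obtained at the end of Example III.24 can be compared with a brute-force count of words over $r$ letters whose runs all have length at most $m$. The sketch below expands the rational function by the usual linear recurrence on coefficients and enumerates words directly; the helper names are hypothetical and the brute force is only practical for small $r$ and $n$.

```python
from itertools import product

def series(num, den, n_max):
    """First n_max + 1 Taylor coefficients of num(z) / den(z), assuming den[0] == 1."""
    coeffs = []
    for n in range(n_max + 1):
        c = num[n] if n < len(num) else 0
        for k in range(1, min(n, len(den) - 1) + 1):
            c -= den[k] * coeffs[n - k]
        coeffs.append(c)
    return coeffs

def bounded_run_series(r, m, n_max):
    """Coefficients of (1 - z^(m+1)) / (1 - r z + (r - 1) z^(m+1))."""
    num = [1] + [0] * m + [-1]
    den = [1, -r] + [0] * (m - 1) + [r - 1]
    return series(num, den, n_max)

def longest_run(word):
    """Length of the longest block of consecutive equal letters (0 for the empty word)."""
    best = run = 0
    previous = None
    for letter in word:
        run = run + 1 if letter == previous else 1
        best = max(best, run)
        previous = letter
    return best

def brute_force(r, m, n_max):
    """Number of words of each length 0..n_max over r letters with all runs of length <= m."""
    return [sum(1 for w in product(range(r), repeat=n) if longest_run(w) <= m)
            for n in range(n_max + 1)]
```

Taking $m = 1$ recovers the Smirnov words: for $r = 4$ the coefficients are 1, 4, 12, 36, 108, ..., in line with $r(r-1)^{n-1}$.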


⁵The variable $z$ marking length, being redundant, is best omitted in this calculation.

▷ III.34. Enumeration in free groups. Consider the composite alphabet $\mathcal{B} = \mathcal{A} \cup \overline{\mathcal{A}}$, where
$\mathcal{A} = \{a_1, \ldots, a_r\}$ and $\overline{\mathcal{A}} = \{\bar a_1, \ldots, \bar a_r\}$. A word over alphabet $\mathcal{B}$ is said to be reduced if it
arises from a word over $\mathcal{B}$ by a maximal application of the reductions $a_j\bar a_j \mapsto \epsilon$ and $\bar a_j a_j \mapsto \epsilon$
(with $\epsilon$ the empty word). A reduced word thus has no factor of the form $a_j\bar a_j$ or $\bar a_j a_j$. Such a
reduced word serves as a canonical representation of an element in the free group $F_r$ generated
by $\mathcal{A}$, upon identifying $\bar a_j = a_j^{-1}$. The GF of the class $\mathcal{R}$ of reduced words, with $u_j$ and $\bar u_j$
marking the number of occurrences of letter $a_j$ and $\bar a_j$, respectively, is

$$R(u_1, \ldots, u_r, \bar u_1, \ldots, \bar u_r) = S\left(\frac{u_1}{1-u_1} + \frac{\bar u_1}{1-\bar u_1}, \ldots, \frac{u_r}{1-u_r} + \frac{\bar u_r}{1-\bar u_r}\right),$$

where $S$ is the GF of Smirnov words, as in (70). In particular this gives the OGF of reduced
words with $z$ marking length as $R(z) = (1+z)/(1 - (2r-1)z)$; this implies $R_n = 2r(2r-1)^{n-1}$
for $n \ge 1$, which matches the result given by elementary combinatorics.

The Abelian image $\lambda(w)$ of an element $w$ of the free group $F_r$ is obtained by letting all
letters commute and applying the reductions $a_j \cdot a_j^{-1} = 1$. It can then be put under the form
$a_1^{m_1} \cdots a_r^{m_r}$, with each $m_j$ in $\mathbb{Z}$, so that it can be identified with an element of $\mathbb{Z}^r$. Let
$\mathbf{x} = (x_1, \ldots, x_r)$ be a vector of indeterminates and define $\mathbf{x}^{\lambda(w)}$ to be the monomial $x_1^{m_1} \cdots x_r^{m_r}$.
Of interest in certain group-theoretic investigations is the MGF of reduced words

$$Q(z; \mathbf{x}) := \sum_{w\in\mathcal{R}} z^{|w|}\,\mathbf{x}^{\lambda(w)} = S\left(\frac{zx_1}{1 - zx_1} + \frac{zx_1^{-1}}{1 - zx_1^{-1}}, \ldots, \frac{zx_r}{1 - zx_r} + \frac{zx_r^{-1}}{1 - zx_r^{-1}}\right),$$

which is found to simplify to

$$Q(z; \mathbf{x}) = \frac{1 - z^2}{1 - z\sum_{j=1}^{r}(x_j + x_j^{-1}) + (2r-1)z^2}.$$

This last form appears in a paper of Rivin [514], where it is obtained by matrix techniques.
Methods developed in Chapter IX can then be used to establish central and local limit laws
for the asymptotic distribution of $\lambda(w)$ over $\mathcal{R}_n$, providing an alternative to the methods of
Rivin [514] and Sharp [539]. (This note is based on an unpublished memo of Flajolet, Noy, and
Ventura, 2006.) ◁


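▷ III.35. Carlitz compositions II. Here is an alternative derivation of the OGF of Carlitz
compositions (Note III.32, p. 201). Carlitz compositions with largest summand $\le r$ are obtained
from the OGF of Smirnov words by the substitution $v_j \mapsto z^j$:

(71)    \[ K^{[r]}(z) = \left(1 - \sum_{j=1}^{r} \frac{z^j}{1+z^j}\right)^{-1}. \]

Letting $r$ tend to infinity then gives the OGF of all Carlitz compositions:

(72)    \[ K(z) = \left(1 - \sum_{j=1}^{\infty} \frac{z^j}{1+z^j}\right)^{-1}. \]

The asymptotic form of the coefficients is derived in Chapter IV, p. 263. ◁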
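
As a sanity check on (72), the coefficients of $K(z)$ can be extracted numerically and compared
with a direct count of compositions having no two equal adjacent summands. This is a minimal
Python sketch; the helper names `carlitz_from_ogf` and `carlitz_brute` are assumptions introduced
here for illustration.

```python
def carlitz_from_ogf(nmax):
    """Coefficients of K(z) = 1 / (1 - sum_{j>=1} z^j / (1 + z^j)) up to z^nmax."""
    # z^j / (1 + z^j) = sum_{i>=1} (-1)^(i-1) z^(i*j), so s[n] collects one signed
    # contribution for every factorization n = i * j.
    s = [0] * (nmax + 1)
    for j in range(1, nmax + 1):
        for i in range(1, nmax // j + 1):
            s[i * j] += (-1) ** (i - 1)
    # K = 1 / (1 - S) gives the convolution recurrence k[n] = sum_m s[m] * k[n - m].
    k = [1] + [0] * nmax
    for n in range(1, nmax + 1):
        k[n] = sum(s[m] * k[n - m] for m in range(1, n + 1))
    return k


def carlitz_brute(n, last=0):
    """Number of compositions of n with no two adjacent summands equal.

    `last` is the summand placed just before (0 when there is none).
    """
    if n == 0:
        return 1
    return sum(carlitz_brute(n - part, part) for part in range(1, n + 1) if part != last)


if __name__ == "__main__":
    print(carlitz_from_ogf(10))
    print([carlitz_brute(n) for n in range(11)])
```

Both lines should print the same list. The brute-force recursion is exponential and is only meant
for small $n$.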


III. 7.4. Inclusion–exclusion. Inclusion–exclusion is a familiar type of reasoning
rooted in elementary mathematics. Its principle, in order to count exactly, consists
in grossly overcounting, then performing a simple correction of the overcounting, then
correcting the correction, and so on. Characteristically, enumerative results provided
by inclusion–exclusion involve an alternating sum. We revisit this process here in the
perspective of multivariate generating functions, where it essentially reduces to a combined
use of substitution and implicit definitions. Our approach follows Goulden and
Jackson’s encyclopaedic treatise [303].

Let $\mathcal{E}$ be a set endowed with a real- or complex-valued measure $|\cdot|$ in such a way
that, for $A, B \subset \mathcal{E}$, there holds

    \[ |A \cup B| = |A| + |B| \quad\text{whenever}\quad A \cap B = \emptyset. \]

Thus, $|\cdot|$ is an additive measure, typically taken as set cardinality (i.e., $|e| = 1$ for
$e \in \mathcal{E}$) or a discrete probability measure on $\mathcal{E}$ (i.e., $|e| = p_e$ for
$e \in \mathcal{E}$). The general formula

    \[ |A \cup B| = |A| + |B| - |AB| \quad\text{where}\quad AB := A \cap B, \]

follows immediately from basic set-theoretic principles:

    \[ \sum_{c \in A \cup B} |c| = \sum_{a \in A} |a| + \sum_{b \in B} |b| - \sum_{i \in A \cap B} |i|. \]

What is called the inclusion–exclusion principle or sieve formula is the following multivariate
generalization, for an arbitrary family $A_1, \ldots, A_r \subset \mathcal{E}$:

        \[ |A_1 \cup \cdots \cup A_r| = |\mathcal{E} \setminus (\bar A_1 \bar A_2 \cdots \bar A_r)| \]
(73)    \[ = \sum_{1 \le i \le r} |A_i| - \sum_{1 \le i_1 < i_2 \le r} |A_{i_1} A_{i_2}|
             + \cdots + (-1)^{r-1} |A_1 A_2 \cdots A_r|, \]

where $\bar A := \mathcal{E} \setminus A$ denotes complement. (The easy proof by induction results
from elementary properties of the boolean algebra formed by the subsets of $\mathcal{E}$; see, e.g.,
[129, Ch. IV].) An alternative formulation results from setting $\bar B_j = A_j$, $B_j = \bar A_j$:

(74)    \[ |B_1 B_2 \cdots B_r| = |\mathcal{E}| - \sum_{1 \le i \le r} |\bar B_i|
             + \sum_{1 \le i_1 < i_2 \le r} |\bar B_{i_1} \bar B_{i_2}|
             - \cdots + (-1)^{r} |\bar B_1 \bar B_2 \cdots \bar B_r|. \]

In terms of measure, this equality quantifies the set of objects satisfying exactly a
collection of simultaneous conditions (all the $B_j$) in terms of those that violate at
least some of the conditions (the $\bar B_j$).


Derangements. Here is a textbook example of an inclusion–exclusion argument,
namely, the enumeration of derangements. Recall that a derangement (p. 122) is a
permutation $\sigma$ such that $\sigma_i \ne i$, for all $i$. Fix $\mathcal{E}$ as the set of all
permutations of $[1, n]$, take the measure $|\cdot|$ to be set cardinality, and let $B_i$ be the
subset of permutations in $\mathcal{E}$ associated to the property $\sigma_i \ne i$. (There are
consequently $r = n$ conditions.) Thus, $B_i$ means having no fixed point at $i$, while $\bar B_i$
means having a fixed point at the distinguished value $i$. Then, the left-hand side of (74) gives
the number of permutations that are derangements; that is, $D_n$. As regards the right-hand side,
the $k$th sum comprises itself $\binom{n}{k}$ terms counting possibilities attached to the choices
of indices $i_1 < \cdots < i_k$; each such choice is associated to a factor
$\bar B_{i_1} \cdots \bar B_{i_k}$ that describes all permutations with fixed points at the
distinguished points $i_1, \ldots, i_k$ (i.e., $\sigma_{i_1} = i_1, \ldots, \sigma_{i_k} = i_k$).
Clearly, $|\bar B_{i_1} \cdots \bar B_{i_k}| = (n-k)!$. Therefore one has

    \[ D_n = n! - \binom{n}{1}(n-1)! + \binom{n}{2}(n-2)! - \cdots + (-1)^n \binom{n}{n}\, 0!, \]

which rewrites into the more familiar form

    \[ \frac{D_n}{n!} = 1 - \frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{(-1)^n}{n!}. \]


This gives an elementary derivation of the derangement numbers already encountered 
in Chapter II and obtained there by means of the labelled set and cycle constructions. 
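
The sieve computation above is short enough to be checked mechanically. The following Python
sketch evaluates the alternating sum coming from (74) and compares it with an exhaustive count;
the function names are illustrative assumptions.

```python
from itertools import permutations
from math import comb, factorial


def derangements_sieve(n):
    """D_n from the sieve formula: sum over k of (-1)^k * C(n, k) * (n - k)!."""
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


def derangements_brute(n):
    """D_n by listing all permutations of {0, ..., n-1} and keeping those without fixed points."""
    return sum(all(s[i] != i for i in range(n)) for s in permutations(range(n)))


if __name__ == "__main__":
    print([derangements_sieve(n) for n in range(8)])
    print([derangements_brute(n) for n in range(8)])
```

The term of index $k$ in the sum is the contribution of the $\binom{n}{k}$ choices of $k$
distinguished fixed points, each leaving $(n-k)!$ arrangements of the remaining elements.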


Symbolic inclusion—exclusion. The derivation above is perfectly fine but com- 
plex examples may represent somewhat of a challenge. In contrast, as we now explain, 
there exists a symbolic alternative based on multivariate generating functions, which 
is technically easy and has great versatility. 

Let us now re-examine derangements in a generating function perspective. Con- 
sider the set P of all permutations and build a superset Q as follows. The set Q 
is comprised of permutations in which an arbitrary number of fixed points—some, 
possibly none, possibly all—have been distinguished. (This corresponds to arbitrary 
products of the B; in the argument above.) For instance Q contains elements like 


    $\underline{1},3,2$,   $1,3,2$,   $\underline{1},\underline{2},3$,   $\underline{1},2,3$,   $1,2,\underline{3}$,   $1,2,3$,


where distinguished fixed points are underlined. Clearly, if one removes the distin- 
guished elements of a y € Q, what is left constitutes an arbitrary permutation of the 
remaining elements. One has 
    \[ \mathcal{Q} = \mathcal{U} \star \mathcal{P}, \]

where $\mathcal{U}$ denotes the class of urns that are sets of atoms. In particular, the EGF of
$\mathcal{Q}$ is $Q(z) = e^z/(1 - z)$. (What we have just done is to enumerate the quantities that
appear in (74), but with the signs “wrong”, i.e., all pluses.)

Introduce now the variable $v$ to mark the distinguished fixed points in objects
of $\mathcal{Q}$. The exponential BGF is then, by the general principles of this chapter,

    \[ Q(z, v) = \frac{e^{vz}}{1-z}. \]


Let now $P(z,u)$ be the BGF of permutations where $u$ marks the number of fixed
points. Permutations with some fixed points distinguished are generated by the substitution
$u \mapsto 1 + v$ inside $P(z,u)$. In other words one has the fundamental relation

    \[ Q(z,v) = P(z, 1+v). \]

This is then immediately solved to give

    \[ P(z,u) = Q(z, u-1), \]

so that knowledge of (the easy) $Q$ gives (the harder) $P$. For the case at hand, this
yields

    \[ P(z,u) = \frac{e^{(u-1)z}}{1-z}, \qquad P(z,0) = D(z) = \frac{e^{-z}}{1-z}, \]

and, in particular, the EGF of derangements has been retrieved. Note that the desired
quantity $P(z,0)$ comes out as $Q(z,-1)$, so that signs corresponding to the sieve
formula (74) have now been put “right”, i.e., alternating.
The process employed for derangements is clearly very general: counting objects
that contain an exact number of “patterns” is reduced to counting objects that contain
the pattern at distinguished places; the latter is usually a simpler problem. The
generating function analogue of inclusion–exclusion is then simply the substitution
$v \mapsto u - 1$, if a bivariate GF is sought, or $v \mapsto -1$ in the univariate case, when
patterns are altogether to be excluded.


Rises in permutations and patterns in words. The book by Goulden and Jack- 
son [303, pp. 45-48] describes a useful formalization of the inclusion process operat- 
ing on MGFs. Conceptually, it combines substitution and implicit definitions, just as 
in the case of derangements above. Again, the modus operandi is best grasped through 
examples, two of which are detailed now. 


Example III.25. Rises and ascending runs in permutations. A rise (also called an ascent)
in a permutation $\sigma = \sigma_1 \cdots \sigma_n$ is a pair of consecutive elements
$\sigma_i \sigma_{i+1}$ satisfying $\sigma_i < \sigma_{i+1}$ (with $1 \le i < n$). The problem is to
determine the number $A_{n,k}$ of permutations of size $n$ having exactly $k$ rises, together with
the exponential BGF $A(z,u)$. By symmetry, we are also enumerating descents (defined by
$\sigma_i > \sigma_{i+1}$) as well as ascending runs that are each terminated by a descent.

Guided by the inclusion–exclusion principle, we tackle the easier problem of enumerating
permutations with distinguished rises, of which the set is denoted by $\mathcal{B}$. For instance,
$\mathcal{B}$ contains elements such as


261)3 74/78 79 ALI 15 12] 5710 [137 14, 


where those rises that are distinguished are represented by arrows. (Note that some rises may
not be distinguished.) Maximal sequences of adjacent distinguished rises (boxed in the
representation) will be called clusters. Then, $\mathcal{B}$ can be specified by the sequence
construction applied to atoms ($\mathcal{Z}$) and clusters ($\mathcal{C}$) as

    \[ \mathcal{B} = \mathrm{SEQ}(\mathcal{Z} + \mathcal{C}), \quad\text{where}\quad
       \mathcal{C} = (\mathcal{Z} \nearrow \mathcal{Z}) + (\mathcal{Z} \nearrow \mathcal{Z} \nearrow \mathcal{Z}) + \cdots
       = \mathrm{SET}_{\ge 2}(\mathcal{Z}), \]

since a cluster is an ordered sequence, or equivalently a set, furthermore having at least two
elements. This gives the EGF of $\mathcal{B}$ as

    \[ B(z) = \frac{1}{1 - (z + (e^z - 1 - z))} = \frac{1}{2 - e^z}, \]

which happens to coincide with the EGF of surjections.

For inclusion–exclusion purposes, we need the BGF of $\mathcal{B}$ with $v$ marking the number of
distinguished rises. A cluster of size $k$ contains $k - 1$ rises, so that

    \[ B(z,v) = \frac{1}{1 - (z + (e^{zv} - 1 - zv)/v)} = \frac{v}{v + 1 - e^{zv}}. \]

Now, the usual argument applies: the BGF $A(z,u)$ satisfies $B(z,v) = A(z, 1+v)$, so that
$A(z,u) = B(z,u-1)$, which yields the particularly simple form

(75)    \[ A(z,u) = \frac{u-1}{u - e^{z(u-1)}}. \]

In particular, this GF expands as

    \[ A(z,u) = 1 + z + (u+1)\frac{z^2}{2!} + (u^2+4u+1)\frac{z^3}{3!}
                + (u^3+11u^2+11u+1)\frac{z^4}{4!} + \cdots. \]
The coefficients $A_{n,k}$ are known as the Eulerian numbers (Invitation, p. 9). In combinatorial
analysis, these numbers are almost as classic as the Stirling numbers; a detailed discussion of
their properties is to be found in classical treatises such as Comtet [129] or Graham et al. [307].
Moments derive easily from an expansion of (75) at $u = 1$, which gives

    \[ A(z,u) = \frac{1}{1-z} + \frac{1}{2}\,\frac{z^2}{(1-z)^2}\,(u-1)
                + \frac{1}{12}\,\frac{z^3(2+z)}{(1-z)^3}\,(u-1)^2 + \cdots. \]

In particular: the mean of the number of rises in a random permutation of size $n$ is
$\frac{1}{2}(n-1)$ and the variance is $\sim \frac{1}{12}n$, ensuring concentration of distribution.
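
The passage from distinguished rises to exact counts can be replayed numerically. The sketch
below is a hedged illustration in Python: it builds $[z^n]B(z,v)$ from the specification
$\mathcal{B} = \mathrm{SEQ}(\mathcal{Z} + \mathcal{C})$, applies the substitution $v \mapsto u - 1$,
and compares the result with a direct count of rises. Polynomials in $v$ or $u$ are stored as
coefficient lists; all function names are assumptions made for this example.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial


def poly_add(p, q):
    """Sum of two polynomials given as coefficient lists (lowest degree first)."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]


def distinguished_rise_polys(nmax):
    """b[n] = [z^n] B(z, v): a block of size k contributes z^k v^(k-1) / k!."""
    b = [[Fraction(1)]]
    for n in range(1, nmax + 1):
        acc = [Fraction(0)]
        for k in range(1, n + 1):
            # Multiply b[n-k] by v^(k-1) / k! (k = 1 is a plain atom, k >= 2 a cluster).
            term = [Fraction(0)] * (k - 1) + [a / factorial(k) for a in b[n - k]]
            acc = poly_add(acc, term)
        b.append(acc)
    return b


def substitute_u_minus_1(p):
    """Coefficients in u of p(u - 1), by binomial expansion."""
    out = [Fraction(0)] * len(p)
    for j, a in enumerate(p):
        for i in range(j + 1):
            out[i] += a * comb(j, i) * (-1) ** (j - i)
    return out


def eulerian_row_sieve(n):
    """Row (A_{n,0}, A_{n,1}, ...) obtained as n! [z^n] B(z, u - 1)."""
    b = distinguished_rise_polys(n)[n]
    return [int(c * factorial(n)) for c in substitute_u_minus_1(b)]


def eulerian_row_brute(n):
    """Same row, by counting rises in every permutation of size n."""
    row = [0] * max(n, 1)
    for s in permutations(range(n)):
        row[sum(s[i] < s[i + 1] for i in range(n - 1))] += 1
    return row


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, eulerian_row_sieve(n), eulerian_row_brute(n))
```

For $n = 4$ both functions should return the coefficients of $u^3 + 11u^2 + 11u + 1$ seen in the
expansion of (75).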

The same method applies to the enumeration of ascending runs: for a fixed parameter $\ell$,
an ascending run of length $\ell$ is a sequence of consecutive elements
$\sigma_i \sigma_{i+1} \cdots \sigma_{i+\ell}$ such that
$\sigma_i < \sigma_{i+1} < \cdots < \sigma_{i+\ell}$. (Thus, a rise is an ascending run of length 1.)
We define a cluster as a sequence of distinguished runs which overlap in the sense that they
share some of the elements of the permutation. The exponential BGF of permutations with
distinguished ascending runs is then

    \[ B(z,v) = \frac{1}{1 - z - \hat I(z,v)}, \quad\text{where}\quad
       \hat I(z,v) = \sum_{n,k} I_{n,k}\, v^k \frac{z^n}{n!}, \]

and $I_{n,k}$ is the number of ways of covering the segment $[1,n]$ with $k$ distinct intervals of
length $\ell$ that are contained in $[1,n]$ and have integral end points. The numbers $I_{n,k}$
themselves result from elementary combinatorics (see also the case of patterns in words below)
and one has for the OGF corresponding to $\hat I$:

    \[ I(z,v) = \frac{v z^{\ell+1}}{1 - v(z + z^2 + \cdots + z^{\ell})}. \]

(Proof: The first segment in the covering must be placed on the left, the others appear in
succession, each shifted right by 1 to $\ell$ positions from the previous one.) The last two
equations finally determine the exponential BGF of permutations with size marked by $z$ and
ascending runs of length $\ell + 1$ marked by $u$,

(76)    \[ A(z,u) = B(z, u-1), \]

given the inclusion–exclusion principle.

The resulting formulae generalize the case of rises ($\ell = 1$). They can be made explicit
by first expanding the OGF $I(z,v)$ into partial fractions, then applying the transformation
$(1 - \omega z)^{-1} \mapsto e^{\omega z}$ in order to translate $I(z,v)$ into $\hat I(z,v)$. The net result is


1 


c 
ey aT where 7(z,v)=(1—z)o+1)+ S cj (v)e2i 


j=l 


A(z, u) = 


involves a sum of exponentials. In this last equation, the $\omega_j(v)$ are the roots of the
characteristic equation $\omega^{\ell} = v(1 + \omega + \cdots + \omega^{\ell-1})$ and the $c_j(v)$
are the corresponding coefficients in the partial fraction decomposition of $I(z,v)$. These
expressions were first published by Elizalde and Noy [190] who obtained them by means of tree
decompositions.

The BGF (76) can be exploited in order to determine quantitative information on long runs
in permutations. First, an expansion at $u = 1$ (also, by a direct reasoning: see the discussion
of hidden words in Chapter I) shows that the mean number of ascending runs of length $\ell - 1$
is $(n - \ell + 1)/\ell!$ exactly, as soon as $n \ge \ell$. This entails that, if $n = o(\ell!)$, the
probability of finding an ascending run of length $\ell - 1$ tends to 0 as $n \to \infty$. What is
used in passing in this argument is the general fact that for a discrete variable $X$ with values
in $0, 1, 2, \ldots$, one has (with Iverson’s notation),

    \[ \mathbb{P}(X \ge 1) = \mathbb{E}([\![X \ge 1]\!]) = \mathbb{E}(\min(X,1)) \le \mathbb{E}(X). \]

An inequality in the converse direction can be obtained from the second moment method. In
effect, the variance of the number of ascending runs of length $\ell - 1$ is found to be of the exact
form $\alpha_\ell n + \beta_\ell$, in which $\alpha_\ell$ is essentially $1/\ell!$ and $\beta_\ell$ is of
comparable order (details omitted). Then, by Chebyshev’s inequalities, concentration of
distribution holds as long as $\ell$ is such that $(\ell+1)! = o(n)$. In this case, with high
probability (i.e., with probability tending to 1 as $n$ tends to $\infty$), there is at least one
ascending run of length $\ell - 1$ (in fact, many). In particular:

    Let $L_n$ be the length of the longest ascending run in a random permutation of $n$
    elements. Let $\ell_0(n)$ be the smallest integer such that $\ell! \ge n$. Then the
    distribution of $L_n$ is concentrated: $L_n/\ell_0(n)$ converges in probability to 1 (in the
    sense of Equation (14), p. 162).

What has been found here is a fairly sharp threshold phenomenon. ■
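
The exact mean $(n - \ell + 1)/\ell!$ and the first-moment bound
$\mathbb{P}(X \ge 1) \le \mathbb{E}(X)$ can both be observed on small permutations. The following
Python sketch enumerates all permutations of size $n$; it counts windows of $\ell$ consecutive
increasing entries, which are the ascending runs of length $\ell - 1$ in the terminology above.
The function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def run_count(perm, ell):
    """Number of windows of ell consecutive entries of perm that are increasing."""
    return sum(
        all(perm[i + j] < perm[i + j + 1] for j in range(ell - 1))
        for i in range(len(perm) - ell + 1)
    )


def run_statistics(n, ell):
    """Exact mean of the window count, and exact probability that at least one window exists."""
    counts = [run_count(p, ell) for p in permutations(range(n))]
    total = factorial(n)
    mean = Fraction(sum(counts), total)
    prob_some = Fraction(sum(c >= 1 for c in counts), total)
    return mean, prob_some


if __name__ == "__main__":
    n = 7
    for ell in (2, 3, 4, 5):
        mean, prob_some = run_statistics(n, ell)
        print(ell, mean, Fraction(n - ell + 1, factorial(ell)), prob_some)
```

As $\ell$ grows with $n$ fixed, the mean drops below 1 and the probability of seeing any such run
drops with it, which is the small-scale shadow of the threshold described above.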


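▷ III.36. Permutations without $\ell$-ascending runs. The EGF of permutations without 1-, 2-
and 3-ascending runs are respectively

    \[ \left(\sum_{i \ge 0} \frac{x^{2i}}{(2i)!} - \frac{x^{2i+1}}{(2i+1)!}\right)^{-1}, \quad
       \left(\sum_{i \ge 0} \frac{x^{3i}}{(3i)!} - \frac{x^{3i+1}}{(3i+1)!}\right)^{-1}, \quad
       \left(\sum_{i \ge 0} \frac{x^{4i}}{(4i)!} - \frac{x^{4i+1}}{(4i+1)!}\right)^{-1}, \]

and so on. (See Carlitz’s review [103] as well as Elizalde and Noy’s article [190] for interesting
results involving several types of order patterns in permutations.) ◁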

Many variations on the theme of rises and ascending runs are clearly possible. Local
order patterns in permutations have been intensely researched, notably by Carlitz
in the 1970s. Goulden and Jackson [303, Sec. 4.3] offer a general theory of patterns
in sequences and permutations. Special permutation patterns associated with binary
increasing trees are also studied by Flajolet, Gourdon, and Martinez [235] (by combinatorial
methods) and Devroye [159] (by probabilistic arguments). On another register,
the longest ascending run has been found above to be of order $(\log n)/\log\log n$
in probability. The superficially resembling problem of analysing the length of the
longest increasing sequence in random permutations (elements must be in ascending
order but need not be adjacent) has attracted a lot of attention, but is considerably
harder. This quantity is $\sim 2\sqrt{n}$ on average and in probability, as shown by a penetrating
analysis of the shape of random Young tableaux due to Logan and Shepp [411]
and Vershik and Kerov [596]. Solving a problem that had been open for over 20 years,
Baik, Deift, and Johansson [24] have eventually determined its limiting distribution.
The undemanding survey by Aldous and Diaconis [10] discusses some of the background
of this problem, while Chapter VIII (p. 596) shows how to derive bounds that
are of the right order of magnitude, using saddle-point methods.


Example III.26. Patterns in words. Take the set of all words $\mathcal{W} = \mathrm{SEQ}\{\mathcal{A}\}$
over a finite alphabet $\mathcal{A} = \{a_1, \ldots, a_r\}$. A pattern $p = p_1 p_2 \cdots p_k$, which
is a particular word of length $k$, has been fixed. What is sought is the BGF $W(z,u)$ of
$\mathcal{W}$, where $u$ marks the number of occurrences of pattern $p$ inside a word of $\mathcal{W}$.
The results of Chapter I already give access to $W(z,0)$, which is the OGF of words not containing
the pattern.

In accordance with the inclusion–exclusion principle, one should introduce the class $\mathcal{X}$ of
words augmented by distinguishing an arbitrary number of occurrences of $p$. Define a cluster
as a maximal collection of distinguished occurrences that have an overlap. For instance, if
$p = aaaaa$, a particular word may give rise to the particular cluster:


abaaaaaaaaaaaaabaaaaaaaabhb 
aaaaa 
aaaaa 
aaaaa 


Then objects of $\mathcal{X}$ decompose as sequences of either arbitrary letters from $\mathcal{A}$ or
clusters:

    \[ \mathcal{X} = \mathrm{SEQ}(\mathcal{A} + \mathcal{C}), \]

with $\mathcal{C}$ the class of all clusters.

Clusters are themselves obtained by repeatedly sliding the pattern, but with the constraint
that it should constantly overlap partly with itself. Let $c(z)$ be the autocorrelation polynomial
of $p$ as defined in Chapter I (p. 61), and set $\hat c(z) = c(z) - 1$. A moment’s reflection should
convince the reader that $z^k \hat c(z)^{s-1}$ when expanded describes all the possibilities for
forming clusters of $s$ overlapping occurrences. On the example above, one has
$\hat c(z) = z + z^2 + z^3 + z^4$, and a particular cluster of 3 overlapping occurrences corresponds
to one of the terms in $z^k \hat c(z)^2$ as follows:


    aaaaa        $z^5$
    aaaaa        $\times\ (z + z^2 + z^3 + z^4)$
    aaaaa        $\times\ (z + z^2 + z^3 + z^4)$.


The OGF of clusters is consequently $C(z) = z^k/(1 - \hat c(z))$ since this quantity describes all
the ways to write the pattern ($z^k$) and then slide it so that it should overlap with itself (this
is given by $(1 - \hat c(z))^{-1}$).

By a similar reasoning, the BGF of clusters is $v z^k/(1 - v \hat c(z))$, and the BGF of
$\mathcal{X}$ with the supplementary variable $v$ marking the number of distinguished occurrences is

    \[ X(z,v) = \frac{1}{1 - rz - v z^k/(1 - v \hat c(z))}. \]

Finally, the usual inclusion–exclusion argument (change $v$ to $u - 1$) yields
$W(z,u) = X(z,u-1)$. As a result:


For a pattern $p$ with correlation polynomial $c(z)$ and length $k$, the BGF of words
over an alphabet of cardinality $r$, where $u$ marks the number of occurrences of $p$, is

(77)    \[ W(z,u) = \frac{(u-1)c(z) - u}{(1 - rz)\bigl((u-1)c(z) - u\bigr) + (u-1)z^k}. \]

The specialization $u = 0$ gives back the formula already found in Chapter I, p. 61. The same
principles clearly apply to weighted models corresponding to unequal letter probabilities,
provided a suitably weighted version of the correlation polynomial is introduced (see Note III.39
below). ■
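
Formula (77) lends itself to a direct numerical check. The Python sketch below expands
$W(z,u)$ as a power series in $z$ whose coefficients are polynomials in $u$ (stored as
coefficient lists), and compares them with an exhaustive count of occurrences, overlaps
included. The function names and the choice of the test pattern are assumptions made for this
illustration.

```python
from itertools import product


def autocorrelation(p):
    """Coefficients of c(z): c[i] = 1 when p shifted right by i positions matches itself."""
    k = len(p)
    return [1 if p[i:] == p[: k - i] else 0 for i in range(k)]


def padd(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]


def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def pattern_bgf_coeffs(p, r, nmax):
    """[z^n] W(z, u) for n <= nmax, each as a list of coefficients in u, following (77)."""
    k, c = len(p), autocorrelation(p)
    # Numerator (u - 1) c(z) - u, coefficient of z^n as a polynomial in u.
    num = [[-c[n], c[n]] for n in range(k)]
    num[0] = padd(num[0], [0, -1])
    # Denominator (1 - r z) * numerator + (u - 1) z^k.
    den = [[0] for _ in range(k + 1)]
    for n in range(k):
        den[n] = padd(den[n], num[n])
        den[n + 1] = padd(den[n + 1], [-r * x for x in num[n]])
    den[k] = padd(den[k], [-1, 1])
    # Solve den * W = num term by term; den[0] equals -1, so
    # W_n = sum_{j >= 1} den[j] * W_{n-j} - num[n].
    w = []
    for n in range(nmax + 1):
        acc = [-x for x in (num[n] if n < k else [0])]
        for j in range(1, min(n, k) + 1):
            acc = padd(acc, pmul(den[j], w[n - j]))
        w.append(acc)
    return w


def occurrence_histogram(p, alphabet, n):
    """Map s -> number of words of length n over alphabet with exactly s occurrences of p."""
    hist = {}
    for letters in product(alphabet, repeat=n):
        word = "".join(letters)
        s = sum(word.startswith(p, i) for i in range(n - len(p) + 1))
        hist[s] = hist.get(s, 0) + 1
    return hist


if __name__ == "__main__":
    coeffs = pattern_bgf_coeffs("aba", 2, 8)
    for n in range(9):
        print(n, [x for x in coeffs[n]], occurrence_histogram("aba", "ab", n))
```

Trailing zeros in the coefficient lists are harmless: only the nonzero entries carry information,
and they should match the histogram entry by entry.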


There are a very large number of formulae related to patterns in strings. For 
instance, BGFs are known for occurrences of one or several patterns under either 
Bernoulli or Markov models; see Note III.39 below. We refer to Szpankowski’s 
book [564] and Lothaire’s chapter [347], where such questions are treated system- 
atically in great detail. Bourdon and Vallée [81] have succeeded in extending this 
approach to dynamical sources of information, thereby uniting a large number of previously
known results. Their approach even makes it possible to analyse the occurrence
of patterns in continued fraction representations of real numbers.


▷ III.37. Moments of number of occurrences. The derivatives of $X(z,v)$ at $v = 0$ give access
to the factorial moments of the number of occurrences of a pattern. In this way or directly, one
determines

    \[ W(z,u) = \frac{1}{1-rz} + \frac{z^k}{(1-rz)^2}\,(u-1)
                + \frac{z^k\bigl((1-rz)(c(z)-1) + z^k\bigr)}{(1-rz)^3}\,(u-1)^2 + \cdots. \]

The mean number of occurrences is $r^{-n}$ times the coefficient of $z^n$ in the coefficient of
$(u-1)$ and is $(n-k+1)r^{-k}$, as anticipated. The coefficient of $(u-1)^2/2!$ is of the form

    \[ \frac{2r^{-2k}}{(1-rz)^3} - \frac{2r^{-k}\bigl(1 + 2kr^{-k} - c(1/r)\bigr)}{(1-rz)^2}
       + \frac{P(z)}{1-rz}, \]

with $P$ a polynomial. This shows that the variance of the number of occurrences is of the form

    \[ \alpha n + \beta, \qquad \alpha = r^{-k}\bigl(2c(1/r) - 1 + r^{-k}(1 - 2k)\bigr). \]

Consequently, the distribution is concentrated around its mean. (See also the discussion of
“Borges’ Theorem” in Chapter I, p. 61.) ◁


▷ III.38. Words with fixed repetitions. Let $W^{\langle s\rangle}(z) = [u^s]W(z,u)$ be the OGF of
words containing a pattern exactly $s$ times. One has, for $s > 0$ and $s = 0$, respectively,

    \[ W^{\langle s\rangle}(z) = \frac{z^k N(z)^{s-1}}{D(z)^{s+1}}, \qquad
       W^{\langle 0\rangle}(z) = \frac{c(z)}{D(z)}, \]

with $N(z)$ and $D(z)$ given by

    \[ N(z) = (1 - rz)(c(z) - 1) + z^k, \qquad D(z) = (1 - rz)c(z) + z^k. \]

The expression of $W^{\langle 0\rangle}$ is in agreement with Chapter I, Equation (62), p. 61. ◁


▷ III.39. Patterns in Bernoulli sequences. Let $\mathcal{A}$ be an alphabet where letter $\alpha$ has
probability $\pi_\alpha$ and consider the Bernoulli model where letters in words are chosen
independently. Fix a pattern $p = p_1 \cdots p_k$ and define the finite language of protrusions as

    \[ \Gamma = \bigcup_{i\,:\,c_i \ne 0} \{p_{i+1} p_{i+2} \cdots p_k\}, \]

where the union is over all correlation positions of the pattern. Define now the correlation
polynomial $\gamma(z)$ (relative to $p$ and the $\pi_\alpha$) as the generating polynomial of the
finite language of protrusions weighted by $(\pi_\alpha)$. For instance, $p = ababa$ gives rise to
$\Gamma = \{\epsilon, ba, baba\}$ and

    \[ \gamma(z) = 1 + \pi_a \pi_b z^2 + \pi_a^2 \pi_b^2 z^4. \]

The BGF of words with $z$ marking length and $u$ marking the number of occurrences of $p$ is

    \[ W(z,u) = \frac{(u-1)\gamma(z) - u}{(1 - z)\bigl((u-1)\gamma(z) - u\bigr) + (u-1)\pi[p]z^k}, \]

where $\pi[p]$ is the product of the probabilities of letters of $p$. ◁


▷ III.40. Patterns in trees I. Consider the class $\mathcal{B}$ of pruned binary trees. An
occurrence of pattern $t$ in a tree $\tau$ is defined by a node of $\tau$ whose dangling subtree is
isomorphic to $t$. We seek the BGF $B(z,u)$ of class $\mathcal{B}$ where $u$ marks the number of
occurrences of $t$.

The OGF of $\mathcal{B}$ is $B(z) = (1 - \sqrt{1-4z})/(2z)$. The quantity $vB(zv)$ is the BGF of
$\mathcal{B}$ with $v$ marking external nodes. By virtue of the pointing operation, the quantity

    \[ U_k := \frac{1}{k!}\left(\partial_v^k\,\bigl(vB(zv)\bigr)\right)_{v=1} \]

describes trees with $k$ distinct external nodes distinguished (pointed). Let $m = |t|$. The
quantity

    \[ V := \sum_k U_k u^k (z^m)^k \quad\text{satisfies}\quad V = \bigl(vB(zv)\bigr)_{v = 1 + uz^m}, \]

by virtue of Taylor’s formula. It is also the BGF of trees with distinguished occurrences of $t$
marked by $v$. Setting $v \mapsto u - 1$ in $V$ then gives $B(z,u)$ as

(78)    \[ B(z,u) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z - 4(u-1)z^{m+1}}\right). \]

In particular $B(z,0) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z + 4z^{m+1}}\right)$ represents the OGF of
trees not containing pattern $t$. The method generalizes to any simple variety of trees. It can be
used to prove that the factored representation (as a directed acyclic graph) of a random tree of
size $n$ has expected size $O(n/\sqrt{\log n})$. (These results appear in [257]; see also
Example IX.26, p. 680, for a related Gaussian law.) ◁


▷ III.41. Patterns in trees II. Here follows an alternative derivation of (78) that is based on the
root decomposition of trees. A pattern $t$ occurs either in the left root subtree $\tau_0$, or in
the right root subtree $\tau_1$, or at the root itself in the case in which $t$ coincides with
$\tau$. Thus the number $\omega[\tau]$ of occurrences of $t$ in $\tau$ satisfies the recursive
definition

    \[ \omega[\tau] = \omega[\tau_0] + \omega[\tau_1] + [\![\tau = t]\!], \qquad \omega[\emptyset] = 0. \]

The function $u^{\omega[\tau]}$ is almost multiplicative, and

    \[ u^{\omega[\tau]} = u^{[\![\tau = t]\!]}\, u^{\omega[\tau_0]} u^{\omega[\tau_1]}
       = u^{\omega[\tau_0]} u^{\omega[\tau_1]} + [\![\tau = t]\!] \cdot (u - 1). \]

Thus, the bivariate generating function $B(z,u) := \sum_{\tau} z^{|\tau|} u^{\omega[\tau]}$
satisfies the quadratic equation,

    \[ B(z,u) = 1 + (u-1)z^m + zB(z,u)^2, \]

which, when solved, yields (78). ◁
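
The quadratic equation of Note III.41 can be iterated coefficient by coefficient and compared
with an exhaustive count of pattern occurrences. The Python sketch below represents a pruned
binary tree as `None` (empty tree) or a pair `(left, right)`; this encoding, the function names,
and the particular test pattern are assumptions made for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def trees(n):
    """All pruned binary trees with n internal nodes."""
    if n == 0:
        return (None,)
    return tuple((l, r) for i in range(n) for l in trees(i) for r in trees(n - 1 - i))


def size(t):
    return 0 if t is None else 1 + size(t[0]) + size(t[1])


def occurrences(tree, pattern):
    """omega[tree]: number of nodes of tree whose dangling subtree equals pattern."""
    if tree is None:
        return 0
    return occurrences(tree[0], pattern) + occurrences(tree[1], pattern) + (tree == pattern)


def padd(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) for i in range(n)]


def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def bgf_coeffs(m, nmax):
    """[z^n] B(z, u) as coefficient lists in u, from B = 1 + (u - 1) z^m + z B^2 (m >= 1)."""
    b = [[1]]
    for n in range(1, nmax + 1):
        acc = [0]
        for i in range(n):
            acc = padd(acc, pmul(b[i], b[n - 1 - i]))
        if n == m:
            acc = padd(acc, [-1, 1])
        b.append(acc)
    return b


def histogram(pattern, n):
    hist = {}
    for t in trees(n):
        s = occurrences(t, pattern)
        hist[s] = hist.get(s, 0) + 1
    return hist


if __name__ == "__main__":
    t = ((None, None), None)          # a left comb with two internal nodes
    coeffs = bgf_coeffs(size(t), 7)
    for n in range(8):
        print(n, coeffs[n], histogram(t, n))
```

Only the size $m$ of the pattern enters the generating function, so any two patterns of the same
size should give the same histograms.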


III. 8. Extremal parameters 


Apart from additively inherited parameters already examined at length in this 
chapter, another important category is that of parameters defined by a maximum rule. 
Two major cases are the largest component in a combinatorial structure (for instance, 
the largest cycle of a permutation) and the maximum degree of nesting of construc- 
tions in a recursive structure (typically, the height of a tree). In this case, bivariate 
generating functions are of little help, because of the nonlinear character of the max- 
function. The standard technique consists in introducing a collection of univariate 
generating functions defined by imposing a bound on the parameter of interest. Such 
GFs can then be constructed by the symbolic method in its univariate version. 


III. 8.1. Largest components. Consider a construction $\mathcal{B} = \Phi[\mathcal{A}]$, where
$\Phi$ may involve an arbitrary combination of basic constructions, and assume here for
simplicity that the construction for $\mathcal{B}$ is a non-recursive one. This corresponds to a
relation between generating functions

    \[ B(z) = \Psi[A(z)], \]

where $\Psi$ is the functional that is the “image” of the combinatorial construction $\Phi$.
Elements of $\mathcal{A}$ thus appear as components in an object $\beta \in \mathcal{B}$. Let
$\mathcal{B}^{\langle b\rangle}$ denote the subclass of $\mathcal{B}$ formed with objects whose
$\mathcal{A}$-components all have a size at most $b$. The GF of $\mathcal{B}^{\langle b\rangle}$ is
obtained by the same process as that of $\mathcal{B}$ itself, save that $A(z)$ should
be replaced by the GF of elements of size at most $b$. Thus,

    \[ B^{\langle b\rangle}(z) = \Psi[\mathbf{T}_b A(z)], \]

where the truncation operator is defined on series by

    \[ \mathbf{T}_b f(z) = \sum_{n=0}^{b} f_n z^n \qquad \Bigl(f(z) = \sum_{n=0}^{\infty} f_n z^n\Bigr). \]
Example III.27. A pot-pourri of largest components. Several instances of largest components
have already been analysed in Chapters I and II. For instance, the cycle decomposition of
permutations translated by

    \[ \mathcal{P} = \mathrm{SET}(\mathrm{CYC}(\mathcal{Z})) \implies P(z) = \exp\left(\log\frac{1}{1-z}\right) \]

gives more generally the EGF of permutations with longest cycle $\le b$,

    \[ P^{\langle b\rangle}(z) = \exp\left(\frac{z}{1} + \frac{z^2}{2} + \cdots + \frac{z^b}{b}\right), \]

which involves the truncated logarithm.

The labelled specification of words over an $m$-ary alphabet

    \[ \mathcal{W} = \mathrm{SEQ}_m(\mathrm{SET}(\mathcal{Z})) \implies W(z) = (e^z)^m \]

leads to the EGF of words such that each letter occurs at most $b$ times:

    \[ W^{\langle b\rangle}(z) = \left(1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots + \frac{z^b}{b!}\right)^m, \]

which now involves the truncated exponential. Similarly, the EGF of set partitions with largest
block of size at most $b$ is

    \[ S^{\langle b\rangle}(z) = \exp\left(\frac{z}{1!} + \frac{z^2}{2!} + \cdots + \frac{z^b}{b!}\right). \]

A slightly less direct example is that of the longest run in a binary string (p. 51), which we
now revisit. The collection $\mathcal{W}$ of binary words over the alphabet $\{a, b\}$ admits the
unlabelled specification

    \[ \mathcal{W} = \mathrm{SEQ}(a) \cdot \mathrm{SEQ}(b\,\mathrm{SEQ}(a)), \]

corresponding to a “scansion” dictated by the occurrences of the letter $b$. The corresponding
OGF then appears under the form

    \[ W(z) = Y(z) \cdot \frac{1}{1 - zY(z)}, \quad\text{where}\quad Y(z) = \frac{1}{1-z} \]

corresponds to $\mathcal{Y} = \mathrm{SEQ}(a)$. Thus, the OGF of strings with at most $k - 1$
consecutive occurrences of the letter $a$ obtains upon replacing $Y(z)$ by its truncation:

    \[ W^{\langle k\rangle}(z) = Y^{\langle k\rangle}(z) \cdot \frac{1}{1 - zY^{\langle k\rangle}(z)},
       \quad\text{where}\quad Y^{\langle k\rangle}(z) = 1 + z + z^2 + \cdots + z^{k-1}, \]

so that

    \[ W^{\langle k\rangle}(z) = \frac{1 - z^k}{1 - 2z + z^{k+1}}. \]

An asymptotic analysis is given in Example V.4, p. 308. ■
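
The rational form of $W^{\langle k\rangle}(z)$ translates into a linear recurrence on its
coefficients, which can be compared with a direct count of binary strings. This is a minimal
Python sketch; the function names are assumptions introduced for the example.

```python
from itertools import product


def bounded_run_counts(k, nmax):
    """[z^n] (1 - z^k) / (1 - 2z + z^(k+1)) for n = 0..nmax."""
    # Clearing the denominator gives w_n = N_n + 2 w_{n-1} - w_{n-k-1},
    # where N_n is the coefficient of z^n in the numerator 1 - z^k.
    w = []
    for n in range(nmax + 1):
        numerator = (1 if n == 0 else 0) - (1 if n == k else 0)
        value = numerator
        if n >= 1:
            value += 2 * w[n - 1]
        if n >= k + 1:
            value -= w[n - k - 1]
        w.append(value)
    return w


def bounded_run_brute(k, n):
    """Number of words of length n over {a, b} with no run of k consecutive letters a."""
    return sum("a" * k not in "".join(s) for s in product("ab", repeat=n))


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, bounded_run_counts(k, 10), [bounded_run_brute(k, n) for n in range(11)])
```

For $k = 2$ the counts are Fibonacci numbers, a familiar special case that gives a quick visual
check of the output.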
Generating functions for largest components are thus easy to derive. The asymptotic
analysis of their coefficients is however often hard when compared to additive
parameters, owing to the need to rely on complex analytic properties of the truncation
operator. The bases of a general asymptotic theory have been laid by Gourdon [305].

▷ III.42. Smallest components. The EGF of permutations with smallest cycle of size $> b$ is

    \[ \frac{1}{1-z}\exp\left(-\frac{z}{1} - \frac{z^2}{2} - \cdots - \frac{z^b}{b}\right). \]

A symbolic theory of smallest components in combinatorial structures is easily developed as
regards formal GFs. Elements of the corresponding asymptotic theory are provided by Panario
and Richmond in [470]. ◁


III. 8.2. Height. The degree of nesting of a recursive construction is a generalization
of the notion of height in the simpler case of trees. Consider for instance a
recursively defined class

    \[ \mathcal{B} = \Phi[\mathcal{B}], \]

where $\Phi$ is a construction. Let $\mathcal{B}^{[h]}$ denote the subclass of $\mathcal{B}$ composed
solely of elements whose construction involves at most $h$ applications of $\Phi$. We have by
definition

    \[ \mathcal{B}^{[h+1]} = \Phi[\mathcal{B}^{[h]}]. \]

Thus, with $\Psi$ the image functional of construction $\Phi$, the corresponding GFs are defined
by a recurrence,

    \[ B^{[h+1]} = \Psi[B^{[h]}]. \]

(This discussion is related to the semantics of recursion, p. 33.)


Example III.28. Generating functions for tree height. Consider first general plane trees:

    \[ \mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}) \implies G(z) = \frac{z}{1 - G(z)}. \]

Define the height of a tree as the number of edges on its longest branch. Then the set of trees of
height $\le h$ satisfies the recurrence

    \[ \mathcal{G}^{[0]} = \mathcal{Z}, \qquad \mathcal{G}^{[h+1]} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}^{[h]}). \]

Accordingly, the OGF of trees of bounded height satisfies

    \[ G^{[0]}(z) = z, \qquad G^{[h+1]}(z) = \frac{z}{1 - G^{[h]}(z)}. \]

The recurrence unwinds and one finds

(79)    \[ G^{[h]}(z) = \cfrac{z}{1 - \cfrac{z}{1 - \cfrac{z}{\;\ddots\;\cfrac{z}{1-z}}}}, \]

where the number of stages in the fraction equals $h$. This is the finite form (technically known
as a “convergent”) of a continued fraction expansion. From implied linear recurrences and
an analysis based on Mellin transforms, de Bruijn, Knuth, and Rice [145] have determined the
average height of a general plane tree to be $\sim \sqrt{\pi n}$. We provide a proof of this fact in
Chapter V (p. 329) dedicated to applications of rational and meromorphic asymptotics.
For plane binary trees defined by

    \[ \mathcal{B} = \mathcal{Z} + \mathcal{B} \times \mathcal{B} \quad\text{so that}\quad B(z) = z + (B(z))^2, \]

(size here is the number of external nodes), the recurrence is

    \[ B^{[0]}(z) = z, \qquad B^{[h+1]}(z) = z + (B^{[h]}(z))^2. \]

In this case, the $B^{[h]}$ are the approximants to a “continuous quadratic form”, namely

    \[ B^{[h]}(z) = z + (z + (z + (\cdots)^2)^2)^2. \]

These are polynomials of degree $2^h$ for which no closed form expression is known, nor even
likely to exist⁸. However, using complex asymptotic methods and singularity analysis, Flajolet
and Odlyzko [246] have shown that the average height of a binary plane tree is
$\sim 2\sqrt{\pi n}$. See Subsection VII. 10.2, p. 535 for the sketch of a proof.

For Cayley trees, finally, the defining equation is

    \[ \mathcal{T} = \mathcal{Z} \star \mathrm{SET}(\mathcal{T}) \implies T(z) = z e^{T(z)}. \]

The EGF of trees of bounded height satisfy the recurrence

    \[ T^{[0]}(z) = z, \qquad T^{[h+1]}(z) = z e^{T^{[h]}(z)}. \]

We are now confronted with a “continuous exponential”,

    \[ T^{[h]}(z) = z e^{z e^{z e^{\cdot^{\cdot^{\cdot^{z e^{z}}}}}}}. \]

The average height was found by Rényi and Szekeres who appealed again to complex analytic
methods and found it to be $\sim \sqrt{2\pi n}$. ■


These examples show that height statistics are closely related to iteration theory. 
Except in a few cases like general plane trees, normally no algebra is available and 
one has to resort to complex analytic methods as expounded in forthcoming chapters. 
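
For general plane trees, the iteration $G^{[h+1]}(z) = z/(1 - G^{[h]}(z))$ can be carried out on
truncated power series and checked against an explicit enumeration. The Python sketch below
encodes a plane tree as the tuple of its root subtrees; that encoding and all function names are
assumptions made for this illustration.

```python
from functools import lru_cache


def bounded_height_plane_trees(h, n):
    """Coefficients up to z^n (n >= 1) of G^[h](z), with G^[0] = z and G^[h+1] = z / (1 - G^[h])."""
    g = [0, 1] + [0] * (n - 1)
    for _ in range(h):
        # q = 1 / (1 - g): since g has no constant term, q_0 = 1 and
        # q_m = sum_{i=1..m} g_i * q_{m-i}.
        q = [1] + [0] * n
        for m in range(1, n + 1):
            q[m] = sum(g[i] * q[m - i] for i in range(1, m + 1))
        g = [0] + q[:n]  # multiplication by z, truncated at z^n
    return g


@lru_cache(maxsize=None)
def forests(m):
    """All sequences of plane trees with m nodes in total."""
    if m == 0:
        return ((),)
    return tuple(
        (t,) + rest
        for k in range(1, m + 1)
        for t in forests(k - 1)  # a tree with k nodes is the forest of its k - 1 non-root nodes
        for rest in forests(m - k)
    )


def plane_trees(n):
    """All plane trees with n >= 1 nodes."""
    return forests(n - 1)


def height(t):
    """Number of edges on the longest branch."""
    return 1 + max(height(s) for s in t) if t else 0


if __name__ == "__main__":
    n = 8
    for h in range(5):
        series = bounded_height_plane_trees(h, n)
        direct = [0] + [sum(height(t) <= h for t in plane_trees(m)) for m in range(1, n + 1)]
        print(h, series, direct)
```

Differences between consecutive rows of the output give the number of trees of each exact
height, which is the starting point for the averages discussed next.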


III. 8.3. Averages and moments. For extremal parameters, the GFs of mean values
obey a general pattern. Let $\mathcal{F}$ be some combinatorial class with GF $f(z)$. Consider
for instance an extremal parameter $\chi$ such that $f^{[h]}(z)$ is the GF of objects with
$\chi$-parameter at most $h$. The GF of objects for which $\chi = h$ exactly is equal to

    \[ f^{[h]}(z) - f^{[h-1]}(z). \]

Thus differencing gives access to the probability distribution of height over $\mathcal{F}$. The
generating function of cumulated values (providing mean values after normalization)
is then

    \[ \Xi(z) = \sum_{h=0}^{\infty} h\left[f^{[h]}(z) - f^{[h-1]}(z)\right]
              = \sum_{h=0}^{\infty} \left[f(z) - f^{[h]}(z)\right], \]

as is readily checked by rearranging the second sum, or equivalently using summation
by parts.
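
The identity for $\Xi(z)$ can be tested on plane binary trees counted by external nodes, using
the recurrence $B^{[h+1]}(z) = z + (B^{[h]}(z))^2$. The following Python sketch sums
$B - B^{[h]}$ over $h$ on truncated series and compares the coefficients with the total height
of all trees of each size; the function names are assumptions made for this example.

```python
from functools import lru_cache


def square(a, n):
    """Square of the series a, truncated at z^n."""
    out = [0] * (n + 1)
    for i in range(n + 1):
        if a[i]:
            for j in range(n + 1 - i):
                out[i + j] += a[i] * a[j]
    return out


def cumulated_heights(n):
    """[z^m] Xi(z) for m <= n (n >= 1), with Xi = sum over h of (B - B^[h])."""
    z = [0, 1] + [0] * (n - 1)
    approximants = [z]
    # A tree with m leaves has height at most m - 1, so B^[n] already agrees
    # with B up to z^n and can stand in for it.
    for _ in range(n):
        sq = square(approximants[-1], n)
        approximants.append([z[i] + sq[i] for i in range(n + 1)])
    full = approximants[-1]
    return [sum(full[m] - b[m] for b in approximants) for m in range(n + 1)]


@lru_cache(maxsize=None)
def heights(n):
    """Heights of all plane binary trees with n external nodes."""
    if n == 1:
        return (0,)
    return tuple(
        1 + max(hl, hr)
        for i in range(1, n)
        for hl in heights(i)
        for hr in heights(n - i)
    )


if __name__ == "__main__":
    n = 9
    xi = cumulated_heights(n)
    for m in range(1, n + 1):
        hs = heights(m)
        print(m, xi[m], sum(hs), sum(hs) / len(hs))
```

Dividing a coefficient of $\Xi(z)$ by the number of trees of that size gives the mean height,
the quantity whose asymptotic form is quoted above.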


⁸ These polynomials are exactly the much-studied Mandelbrot polynomials whose behaviour in the
complex plane gives rise to extraordinary graphics (Figure VII.23, p. 536).

For the largest components, the formulae involve truncated Taylor series. For
height, analysis involves in all generality the differences between the fixed point of a
functional $\Phi$ (the GF $f(z)$) and the approximations to the fixed point ($f^{[h]}(z)$) provided
by iteration. This is a common scheme in extremal statistics.


▷ III.43. The height of increasing binary trees. Given the specification of increasing binary
trees in Equation (61), p. 143, the EGF of trees of height at most $h$ is given by the recurrence

    \[ I^{[0]}(z) = 1, \qquad I^{[h+1]}(z) = 1 + \int_0^z I^{[h]}(w)^2\, dw. \]

Devroye [157, 158] showed in 1986 that the expected height of a tree of size $n$ is asymptotic
to $c \log n$ where $c \doteq 4.31107$ is a solution of $c \log((2e)/c) = 1$. ◁


▷ III.44. Hierarchical partitions. Let $\varepsilon(z) = e^z - 1$. The generating function

    \[ \varepsilon(\varepsilon(\cdots(\varepsilon(z)))) \qquad (h \text{ times}) \]

can be interpreted as the EGF of certain hierarchical partitions. (Such structures show up in
statistical classification theory [585, 586].) ◁


▷ III.45. Balanced trees. Balanced structures lead to counting GFs close to the ones obtained
for height statistics. The OGF of balanced 2–3 trees of height $h$ counted by the number of leaves
satisfies the recurrence

    \[ Z^{[h+1]}(z) = Z^{[h]}(z^2 + z^3) = (Z^{[h]}(z))^2 + (Z^{[h]}(z))^3, \]

which can be expressed in terms of the iterates of $\sigma(z) = z^2 + z^3$ (see Note I.67, p. 91, as
well as Chapter IV, p. 281, for asymptotics). It is possible to express the OGF of cumulated values
of the number of internal nodes in such trees in terms of the iterates of $\sigma$. ◁


▷ III.46. Extremal statistics in random mappings. One can express the EGFs relative to the
largest cycle, longest branch, and diameter of functional graphs. Similarly for the largest tree,
largest component. [Hint: see [247] for details.] ◁


▷ III.47. Deep nodes in trees. The BGF giving the number of nodes at maximal depth in
a general plane tree or a Cayley tree can be expressed in terms of a continued fraction or a
continuous exponential. ◁


III. 9. Perspective


The message of this chapter is that we can use the symbolic method not just to 
count combinatorial objects but also to quantify their properties. The relative ease 
with which we are able to do so is testimony to the power of the method as a major 
organizing principle of analytic combinatorics. 

The global framework of the symbolic method leads us to a natural structural cat- 
egorization of parameters of combinatorial objects. First, the concept of inherited pa- 
rameters permits a direct extension of the already seen formal translation mechanisms 
from combinatorial structures to GFs, for both labelled and unlabelled objects—this 
leads to MGFs useful for solving a broad variety of classical combinatorial problems. 
Second, the adaptation of the theory to recursive parameters provides information 
about trees and similar structures, this even in the absence of explicit representations 
of the associated MGFs. Third, extremal parameters, which are defined by a maxi- 
mum rule (rather than an additive rule), can be studied by analysing families of uni- 
variate GFs. Yet another illustration of the power of the symbolic method is found in 
the notion of complete GF, which in particular enables us to study Bernoulli trials and 
branching processes. 




As we shall see starting with Chapter IV, these approaches become especially 
powerful since they serve as the basis for the asymptotic analysis of properties of 
structures. Not only does the symbolic method provide precise information about 
particular parameters, but it also paves the way for the discovery of general schemas 
and theorems that tell us what to expect about a broad variety of combinatorial types. 


Bibliographic notes. Multivariate generating functions are a common tool from classical com- 
binatorial analysis. Comtet’s book [129] is once more an excellent source of examples. A 
systematization of multivariate generating functions for inherited parameters is given in the 
book by Goulden and Jackson [303]. 

In contrast generating functions for cumulated values of parameters (related to averages) 
seemed to have received relatively little attention until the advent of digital computers and 
the analysis of algorithms. Many important techniques are implicit in Knuth’s treatises, es- 
pecially [377, 378]. Wilf discusses related issues in his book [608] and the paper [606]. 
Early systems specialized to tree algorithms were proposed by Flajolet and Steyaert in the 
1980s [215, 261, 262, 560]; see also Berstel and Reutenauer’s work [56]. Some of the ideas 
developed there initially drew their inspiration from the well-established treatment of formal 
power series in non-commutative indeterminates; see the books by Eilenberg [189] and Sa- 
lomaa and Soittola [527] as well as the proceedings edited by Berstel [54]. Several compu- 
tations in this area can nowadays even be automated with the help of computer algebra sys- 
tems [255, 528, 628]. 


Je n’ai jamais été assez loin pour bien sentir l’application de l’algèbre à la géométrie. Je
n’aimais point cette manière d’opérer sans voir ce qu’on fait, et il me sembloit que résoudre un
problème de géométrie par les équations, c’étoit jouer un air en tournant une manivelle.


(“I never went far enough to get a good feel for the application of algebra to geometry. I was not pleased
with this method of operating according to the rules without seeing what one does; solving geometrical 
problems by means of equations seemed like playing a tune by turning a crank.”) 


— JEAN-JACQUES ROUSSEAU, Les Confessions, Livre VI 


Part B 


COMPLEX ASYMPTOTICS 


IV 


Complex Analysis, Rational and 
Meromorphic Asymptotics 


Entre deux vérités du domaine réel, le chemin le plus facile et le plus court 
passe bien souvent par le domaine complexe. 


PAUL PAINLEVÉ [467, p. 2]


It has been written that 
the shortest and best way between two truths of the real domain 


often passes through the imaginary one.¹


— JACQUES HADAMARD [316, p. 123]
Generating functions are a central concept of combinatorial theory. In Part A, we have
treated them as formal objects; that is, as formal power series. Indeed, the major theme 
of Chapters I-III has been to demonstrate how the algebraic structure of generating 
functions directly reflects the structure of combinatorial classes. From now on, we 
examine generating functions in the light of analysis. This point of view involves 
assigning values to the variables that appear in generating functions. 

Comparatively little benefit results from assigning only real values to the vari- 
able z that figures in a univariate generating function. In contrast, assigning complex 
values turns out to have serendipitous consequences. When we do so, a generating 
function becomes a geometric transformation of the complex plane. This transforma- 
tion is very regular near the origin—one says that it is analytic (or holomorphic). In 
other words, near 0, it only effects a smooth distortion of the complex plane. Farther 
away from the origin, some cracks start appearing in the picture. These cracks—the 
dignified name is singularities—correspond to the disappearance of smoothness. It 
turns out that a function’s singularities provide a wealth of information regarding the 
function’s coefficients, and especially their asymptotic rate of growth. Adopting a 
geometric point of view for generating functions has a large pay-off. 


¹Hadamard’s quotation (1945) is a free rendering of the original one due to Painlevé (1900); namely,
“The shortest and easiest path between two truths of the real domain most often passes through the complex
domain.”
By focusing on singularities, analytic combinatorics treads in the steps of many
respectable older areas of mathematics. For instance, Euler recognized that for the
Riemann zeta function $\zeta(s)$ to become infinite (hence have a singularity) at 1 implies
the existence of infinitely many prime numbers; Riemann, Hadamard, and de la
Vallée-Poussin later uncovered deep connections between quantitative properties of
prime numbers and singularities of $1/\zeta(s)$.


The purpose of this chapter is largely to serve as an accessible introduction or 
a refresher of basic notions regarding analytic functions. We start by recalling the 
elementary theory of functions and their singularities in a style tuned to the needs of 
analytic combinatorics. Cauchy’s integral formula expresses coefficients of analytic 
functions as contour integrals. Suitable uses of Cauchy’s integral formula then make 
it possible to estimate such coefficients by suitably selecting an appropriate contour 
of integration. For the common case of functions that have singularities at a finite 
distance, the exponential growth formula relates the location of the singularities closest
to the origin (these are also known as dominant singularities) to the exponential
order of growth of coefficients. The nature of these singularities then dictates the fine 
structure of the asymptotics of the function’s coefficients, especially the subexponen- 
tial factors involved. 


As regards generating functions, combinatorial enumeration problems can be 
broadly categorized according to a hierarchy of increasing structural complexity. At 
the most basic level, we encounter scattered classes, which are simple enough, so that 
the associated generating function and coefficients can be made explicit. (Examples of 
Part A include binary and general plane trees, Cayley trees, derangements, mappings, 
and set partitions). In that case, elementary real-analysis techniques usually suffice 
to estimate asymptotically counting sequences. At the next, intermediate, level, the 
generating function is still explicit, but its form is such that no simple expression is 
available for coefficients. This is where the theory developed in this and the next chap- 
ters comes into play. It usually suffices to have an expression for a generating function, 
but not necessarily its coefficients, so as to be able to deduce precise asymptotic estimates
of its coefficients. (Surjections, generalized derangements, unary–binary trees
are easily subjected to this method. A striking example, that of trains, is detailed in
Section IV.4.) Properties of analytic functions then make this analysis depend only on
local properties of the generating function at a few points, its dominant singularities. 
The third, highest, level, within the perspective of analytic combinatorics, comprises 
generating functions that can no longer be made explicit, but are only determined by a 
functional equation. This covers structures defined recursively or implicitly by means 
of the basic constructors of Part A. The analytic approach even applies to a large 
number of such cases. (Examples include simple families of trees, balanced trees, 
and the enumeration of certain molecules treated at the end of this chapter. Another 
characteristic example is that of non-plane unlabelled trees treated in Chapter VII.) 


As we shall see throughout this book, the analytic methodology applies to almost 
all the combinatorial classes studied in Part A, which are provided by the symbolic 
method. In the present chapter we carry out this programme for rational functions and 
meromorphic functions (i.e., functions whose singularities are poles). 
IV.1. Generating functions as analytic objects


Generating functions, considered in Part A as purely formal objects subject to al- 
gebraic operations, are now going to be interpreted as analytic objects. In so doing one 
gains easy access to the asymptotic form of their coefficients. This informal section 
offers a glimpse of themes that form the basis of Chapters IV–VII.


In order to introduce the subject, let us start with two simple generating functions, 
one, f(z), being the OGF of the Catalan numbers (cf G(z), p. 35), the other, g(z), 
being the EGF of derangements (cf D(z), p. 123): 


$$f(z) = \tfrac12\bigl(1 - \sqrt{1 - 4z}\bigr), \qquad g(z) = \frac{\exp(-z)}{1 - z}. \tag{1}$$
At this stage, the forms above are merely compact descriptions of formal power series 
built from the elementary series 


$$(1-y)^{-1} = 1 + y + y^2 + \cdots, \qquad (1-y)^{1/2} = 1 - \tfrac12 y - \tfrac18 y^2 - \cdots,$$

$$\exp(y) = 1 + \frac{1}{1!}\,y + \frac{1}{2!}\,y^2 + \cdots,$$


by standard composition rules. Accordingly, the coefficients of both GFs are known 
in explicit form: 


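$$f_n := [z^n]f(z) = \frac{1}{n}\binom{2n-2}{n-1}, \qquad g_n := [z^n]g(z) = \frac{1}{0!} - \frac{1}{1!} + \frac{1}{2!} - \cdots + \frac{(-1)^n}{n!}.$$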


Stirling’s formula and the comparison with the alternating series giving exp(—1) pro- 
vide, respectively, 


$$f_n \underset{n\to\infty}{\sim} \frac{4^{n-1}}{\sqrt{\pi n^3}}, \qquad g_n \underset{n\to\infty}{\sim} e^{-1} \doteq 0.36787. \tag{2}$$
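
The two estimates can be checked against the exact coefficients. The sketch below uses the closed forms $f_n = \frac1n\binom{2n-2}{n-1}$ and $g_n = \sum_{k \le n} (-1)^k/k!$; the function names are hypothetical and the sample values of n are arbitrary.

```python
import math
from fractions import Fraction

def catalan_coeff(n):
    """f_n = [z^n] (1 - sqrt(1 - 4z))/2 = binom(2n-2, n-1)/n, for n >= 1."""
    return math.comb(2 * n - 2, n - 1) // n

def derangement_coeff(n):
    """g_n = [z^n] exp(-z)/(1-z) = sum_{k=0}^{n} (-1)^k / k!, exactly."""
    return sum(Fraction((-1) ** k, math.factorial(k)) for k in range(n + 1))

def catalan_ratio(n):
    """Ratio of f_n to the estimate 4^(n-1) / sqrt(pi * n^3)."""
    return catalan_coeff(n) / (4.0 ** (n - 1) / math.sqrt(math.pi * n ** 3))

for n in (10, 50, 200):
    # The ratio tends to 1; the difference from exp(-1) tends to 0 very fast.
    print(n, catalan_ratio(n), float(derangement_coeff(n)) - math.exp(-1))
```

The ratio for $f_n$ approaches 1 only at rate $O(1/n)$, whereas $g_n$ agrees with $e^{-1}$ to machine precision already for moderate n, a contrast the rest of the section attributes to the different nature of the two singularities.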


Our purpose now is to provide intuition on how such approximations could be 
derived without appealing to explicit forms. We thus examine, heuristically for the 
moment, the direct relationship between the asymptotic forms (2) and the structure of 
the corresponding generating functions in (1). 


Granted the growth estimates available for $f_n$ and $g_n$, it is legitimate to substitute
in the power series expansions of the GFs f(z) and g(z) any real or complex value of
a small enough modulus, the upper bounds on modulus being $\rho_f = 1/4$ (for f) and
$\rho_g = 1$ (for g). Figure IV.1 represents the graph of the resulting functions when such
real values are assigned to z. The graphs are smooth, representing functions that are
differentiable any number of times for z interior to the interval $(-\rho, +\rho)$. However,
at the right boundary point, smoothness stops: g(z) becomes infinite at z = 1, and so it
even ceases to be finitely defined; f(z) does tend to the limit $\tfrac12$ as $z \to (\tfrac14)^-$, but its
derivative becomes infinite there. Such special points at which smoothness stops are 
called singularities, a term that will acquire a precise meaning in the next sections. 

Observe also that, in spite of the series expressions being divergent outside the 
specified intervals, the functions f(z) and g(z) can be continued in certain regions: it 
Figure IV.1. Left: the graph of the Catalan OGF, f(z), for $z \in (-\tfrac14, +\tfrac14)$; right: the
graph of the derangement EGF, g(z), for $z \in (-1, +1)$.


suffices to make use of the global expressions of Equation (1), with exp and $\sqrt{\ }$ being
assigned their usual real-analytic interpretation. For instance:


$$f(-1) = \tfrac12\bigl(1 - \sqrt{5}\bigr), \qquad g(-2) = \frac{e^2}{3}.$$


Such continuation properties, most notably to the complex realm, will prove essential 
in developing efficient methods for coefficient asymptotics. 


One may proceed similarly with complex numbers, starting with numbers whose 
modulus is less than the radius of convergence of the series defining the GF. Fig- 
ure IV.2 displays the images of regular grids by f and g, as given by (1). This illus- 
trates the fact that a regular grid is transformed into an orthogonal network of curves 
and more precisely that f and g preserve angles—this property corresponds to com- 
plex differentiability and is equivalent to analyticity to be introduced shortly. The 
singularity of f is clearly perceptible on the right of its diagram, since, at z = 1/4 
(corresponding to f(z) = 1/2), the function f folds lines and divides angles by a 
factor of 2. The singularity of g at z = 1 is indirectly perceptible from the fact that 
$g(z) \to \infty$ as $z \to 1$ (the square grid had to be truncated at z = 0.75, since this book
can only accommodate finite graphs). 


Let us now turn to coefficient asymptotics. As is expressed by (2), the coefficients 
$f_n$ and $g_n$ each belong to a general asymptotic type for coefficients of a function F,
namely,


$$[z^n]F(z) = A^n\,\theta(n), \tag{3}$$
corresponding to an exponential growth factor $A^n$ modulated by a tame factor $\theta(n)$,
which is subexponential. Here, one has A = 4 for $f_n$ and A = 1 for $g_n$; also,
$\theta(n) \sim \tfrac14\bigl(\sqrt{\pi n^3}\bigr)^{-1}$ for $f_n$ and $\theta(n) \sim e^{-1}$ for $g_n$. Clearly, A should be related
to the radius of convergence of the series. We shall see that, invariably, for combinatorial
generating functions, the exponential rate of growth is given by $A = 1/\rho$,
where $\rho$ is the first singularity encountered along the positive real axis (Theorem IV.6,
Figure IV.2. The images of regular grids by f(z) (left) and g(z) (right).


p. 240). In addition, under general complex analytic conditions, it will be established
that $\theta(n) = O(1)$ is systematically associated to a simple pole of the generating function
(Theorem IV.10, p. 258), while $\theta(n) = O(n^{-3/2})$ systematically arises from a
singularity that is of the square-root type (Chapters VI and VII). We enunciate:


First Principle of Coefficient Asymptotics. The location of a function’s
singularities dictates the exponential growth ($A^n$) of its coefficients.
Second Principle of Coefficient Asymptotics. The nature of a function’s
singularities determines the associated subexponential factor ($\theta(n)$).


Observe that the rescaling rule, 
$$[z^n]F(z) = \rho^{-n}\,[z^n]F(\rho z),$$


enables one to normalize functions so that they are singular at 1. Then, various the- 
orems, starting with Theorems IV.9 and IV.10, provide sufficient conditions under 
which the following fundamental implication is valid, 


$$h(z) \sim \sigma(z) \quad\Longrightarrow\quad [z^n]h(z) \sim [z^n]\sigma(z). \tag{4}$$


There h(z), whose coefficients are to be estimated, is a function singular at 1 and $\sigma(z)$
is a local approximation near the singularity; usually $\sigma$ is a much simpler function,
typically like $(1-z)^{\alpha}\log^{\beta}(1-z)$, whose coefficients are comparatively easy to estimate
(Chapter VI). The relation (4) expresses a mapping between asymptotic scales
of functions near singularities and asymptotic scales of coefficients. Under suitable
conditions, it then suffices to estimate a function locally at a few special points (sin- 
gularities), in order to estimate its coefficients asymptotically. 
A succinct roadmap. Here is what now awaits the reader. Section IV.2 serves
to introduce basic notions of complex function theory. Singularities and exponential 
growth of coefficients are examined in Section IV.3, which justifies the First Principle.
Next, in Section IV.4, we establish the computability of exponential growth rates 
for all the non-recursive structures that are specifiable. Section IV.5 presents two 
important theorems that deal with rational and meromorphic functions and illustrate 
the Second Principle, in its simplest version (the subexponential factors are merely 
polynomials). Then, Section IV.6 examines constructively ways to locate singularities
and treats in detail the case of patterns in words. Finally, Section IV.7 shows how 
functions only known through a functional equation may be accessible to complex 
asymptotic methods. 


> IV.1. Euler, the discrete, and the continuous. Euler’s proof of the existence of infinitely
many prime numbers illustrates in a striking manner the way analysis of generating functions
can inform us on the discrete realm. Define, for real s > 1, the function

$$\zeta(s) := \sum_{n=1}^{\infty} \frac{1}{n^s},$$

known as the Riemann zeta function. The decomposition (p ranges over the prime numbers
2, 3, 5, ...)

$$\zeta(s) = \Bigl(1 + \frac{1}{2^s} + \frac{1}{2^{2s}} + \cdots\Bigr)\Bigl(1 + \frac{1}{3^s} + \frac{1}{3^{2s}} + \cdots\Bigr)\Bigl(1 + \frac{1}{5^s} + \frac{1}{5^{2s}} + \cdots\Bigr)\cdots = \prod_{p}\Bigl(1 - \frac{1}{p^s}\Bigr)^{-1} \tag{5}$$

expresses precisely the fact that each integer has a unique decomposition as a product of primes.
Analytically, the identity (5) is easily checked to be valid for all s > 1. Now suppose that there
were only finitely many primes. Let s tend to $1^+$ in (5). Then, the left-hand side becomes
infinite, while the right-hand side tends to the finite limit $\prod_p (1 - 1/p)^{-1}$: a contradiction has
been reached. ◁
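
The identity between the Dirichlet series and the Euler product can be observed numerically. Here is a minimal sketch, assuming truncation of both sides at 10000 and the test value s = 2, where the common value is $\pi^2/6$; the helper names are hypothetical.

```python
import math

def primes_up_to(n):
    """Primes <= n by the sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = [False] * len(sieve[p * p :: p])
    return [p for p, is_prime in enumerate(sieve) if is_prime]

def zeta_partial_sum(s, n_max):
    """Truncated series: sum of 1/n^s for n <= n_max."""
    return sum(1.0 / n ** s for n in range(1, n_max + 1))

def euler_partial_product(s, p_max):
    """Truncated Euler product over the primes p <= p_max."""
    prod = 1.0
    for p in primes_up_to(p_max):
        prod /= 1.0 - p ** (-s)
    return prod

print(zeta_partial_sum(2, 10000), euler_partial_product(2, 10000), math.pi ** 2 / 6)
```

Calling `zeta_partial_sum(1.0, N)` for growing N shows the harmonic sum increasing without bound, whereas any product over a fixed finite set of primes stays bounded as s tends to 1; this is the tension the argument exploits.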


> IV.2. Elementary transfers. Elementary series manipulations yield the following general
result: Let h(z) be a power series with radius of convergence > 1 and assume that $h(1) \neq 0$; then
one has

$$[z^n]\frac{h(z)}{1-z} \sim h(1), \qquad [z^n]\,h(z)\sqrt{1-z} \sim -\frac{h(1)}{2\sqrt{\pi n^3}}, \qquad [z^n]\,h(z)\log\frac{1}{1-z} \sim \frac{h(1)}{n}.$$

See our discussion on p. 434 and Bender’s survey [36] for many similar statements, of which
this chapter and Chapter VI provide many far-reaching extensions. ◁
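
As an illustration, the three classical estimates $[z^n]\,h(z)/(1-z) \sim h(1)$, $[z^n]\,h(z)\sqrt{1-z} \sim -h(1)/(2\sqrt{\pi n^3})$ and $[z^n]\,h(z)\log\frac{1}{1-z} \sim h(1)/n$ can be tested by direct convolution. The sketch assumes the sample function $h(z) = e^z$ (radius of convergence infinite, $h(1) = e$); the function name is hypothetical.

```python
import math

def transfer_check(n):
    """Coefficients of order n of h/(1-z), h*sqrt(1-z), h*log(1/(1-z)), h = exp."""
    h = [1.0]
    for k in range(1, n + 1):
        h.append(h[-1] / k)                  # h_k = 1/k!
    s = [1.0]
    for m in range(1, n + 1):
        s.append(s[-1] * (m - 1.5) / m)      # s_m = [z^m] sqrt(1 - z)
    pole = sum(h)                                        # partial sums of h
    root = sum(h[k] * s[n - k] for k in range(n + 1))    # convolution with s
    log_ = sum(h[k] / (n - k) for k in range(n))         # [z^m] log(1/(1-z)) = 1/m
    return pole, root, log_

n = 200
pole, root, log_ = transfer_check(n)
print(pole / math.e)
print(root / (-math.e / (2 * math.sqrt(math.pi * n ** 3))))
print(log_ * n / math.e)
```

All three printed ratios should be close to 1 for large n.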


> IV.3. Asymptotics of generalized derangements. The EGF of permutations without cycles of
length 1 and 2 satisfies (p. 123)

$$j(z) = \frac{e^{-z - z^2/2}}{1 - z} \qquad\text{with}\qquad j(z) \underset{z\to 1}{\sim} \frac{e^{-3/2}}{1 - z}.$$

Analogy with derangements suggests that $[z^n]j(z) \underset{n\to\infty}{\sim} e^{-3/2}$. [For a proof, use Note IV.2 or
refer to Example IV.9 below, p. 261.] Here is a table of exact values of $[z^n]j(z)$ (with relative
error of the approximation by $e^{-3/2}$ in parentheses):

            n = 5        n = 10         n = 20           n = 50
  j_n:      0.2          0.22317        0.2231301600     0.2231301601484298289332804707640122
  error:    (10^{-1})    (2·10^{-4})    (3·10^{-10})     (10^{-33})

The quality of the asymptotic approximation is extremely good, such a property being, as we
shall see, invariably attached to polar singularities.
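
The table can be recomputed with exact rational arithmetic. The sketch below relies on two elementary facts that are assumptions of this illustration rather than statements of the note: the coefficients $c_k$ of $E(z) = e^{-z - z^2/2}$ satisfy $(k+1)c_{k+1} = -c_k - c_{k-1}$ (from $E' = -(1+z)E$), and division by $1 - z$ amounts to taking partial sums. Names are hypothetical.

```python
from decimal import Decimal, getcontext
from fractions import Fraction

getcontext().prec = 60
LIMIT = Decimal("-1.5").exp()                # e^(-3/2) to about 60 digits

def j_values(n_max):
    """Exact values j_0 .. j_{n_max} of [z^n] exp(-z - z^2/2) / (1 - z)."""
    c = [Fraction(1), Fraction(-1)]          # c_k = [z^k] exp(-z - z^2/2)
    for k in range(1, n_max):
        c.append((-c[k] - c[k - 1]) / (k + 1))
    values, running = [], Fraction(0)
    for ck in c[: n_max + 1]:
        running += ck                        # partial sums <-> factor 1/(1 - z)
        values.append(running)
    return values

def relative_error(jn):
    """Relative error of the approximation of jn by e^(-3/2)."""
    approx = Decimal(jn.numerator) / Decimal(jn.denominator)
    return abs(approx - LIMIT) / LIMIT

j = j_values(50)
for n in (5, 10, 20, 50):
    print(n, float(j[n]), relative_error(j[n]))
```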


IV.2. Analytic functions and meromorphic functions 


Analytic functions are a primary mathematical concept of asymptotic theory. They 
can be characterized in two essentially equivalent ways (see Subsection IV.2.1): by
means of convergent series expansions (à la Cauchy and Weierstrass) and by differentiability
properties (à la Riemann). The first aspect is directly related to the use of
generating functions for enumeration; the second one allows for a powerful abstract 
discussion of closure properties that usually requires little computation. 

Integral calculus with analytic functions (see Subsection IV.2.2) assumes a shape
radically different from that which prevails in the real domain: integrals become 
quintessentially independent of details of the integration contour—certainly the prime 
example of this fact is Cauchy’s famous residue theorem. Conceptually, this indepen- 
dence makes it possible to relate properties of a function at a point (e.g., the coeffi- 
cients of its expansion at 0) to its properties at another far-away point (e.g., its residue 
at a pole). 

The presentation in this section and the next one constitutes an informal review 
of basic properties of analytic functions tuned to the needs of asymptotic analysis of 
counting sequences. The entry in Appendix B.2: Equivalent definitions of analyticity, 
p. 741, provides further information, in particular a proof of the Basic Equivalence 
Theorem, Theorem IV.1 below. For a detailed treatment, we refer the reader to one 
of the many excellent treatises on the subject, such as the books by Dieudonné [165], 
Henrici [329], Hille [334], Knopp [373], Titchmarsh [577], or Whittaker and Wat- 
son [604]. The reader previously unfamiliar with the theory of analytic functions 
should essentially be able to adopt Theorems IV.1 and IV.2 as “axioms” and start from 
here using basic definitions and a fair knowledge of elementary calculus. Figure IV.19 
at the end of this chapter (p. 287) recapitulates the main results of relevance to Analytic 
Combinatorics. 


IV.2.1. Basics. We shall consider functions defined in certain regions of the
complex domain C. By a region is meant an open subset $\Omega$ of the complex plane
that is connected. Here are some examples:


[Figure: four examples of regions: a simply connected domain, a slit complex plane, an indented disc, an annulus.]


Classical treatises teach us how to extend to the complex domain the standard 
functions of real analysis: polynomials are immediately extended as soon as complex 
addition and multiplication have been defined, while the exponential is definable by 
means of Euler’s formula. One has for instance 


$$z^2 = (x^2 - y^2) + 2ixy, \qquad e^z = e^x\cos y + i\,e^x\sin y,$$
if z = x + iy, that is, $x = \Re(z)$ and $y = \Im(z)$ are the real and imaginary parts of z.
Both functions are consequently defined over the whole complex plane C. 

The square-root and logarithm functions are conveniently described in polar co- 
ordinates: 


$$\sqrt{z} = \sqrt{\rho}\,e^{i\theta/2}, \qquad \log z = \log\rho + i\theta, \tag{6}$$


if $z = \rho e^{i\theta}$. One can take the domain of validity of (6) to be the complex plane slit
along the axis from 0 to $-\infty$, that is, restrict $\theta$ to the open interval $(-\pi, +\pi)$, in which
case the definitions above specify what is known as the principal determination. There
is no way for instance to extend by continuity the definition of $\sqrt{z}$ in any domain
containing 0 in its interior since, for a > 0, one has $\sqrt{z} \to i\sqrt{a}$ as
$z \to -a$ from above, whereas $\sqrt{z} \to -i\sqrt{a}$ as $z \to -a$ from below. This situation is
depicted here:


[Figure: the values of $\sqrt{z}$ as z varies along |z| = a.]


The point z = 0, where several determinations “meet”, is accordingly known as a 
branch point. 
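
The jump across the slit is easy to observe with Python's standard `cmath` module, whose `sqrt` and `log` implement the principal determination with a branch cut along the negative real axis. This is only a numerical illustration; the sample value a = 4 and the offset `eps` are arbitrary choices.

```python
import cmath

a = 4.0
eps = 1e-12                                   # tiny offset from the real axis

above = cmath.sqrt(complex(-a, +eps))         # z -> -a from the upper half-plane
below = cmath.sqrt(complex(-a, -eps))         # z -> -a from the lower half-plane
print(above, below)                           # close to +2i and -2i respectively

# The principal logarithm jumps by 2*pi*i across the same slit.
arg_above = cmath.log(complex(-a, +eps)).imag
arg_below = cmath.log(complex(-a, -eps)).imag
print(arg_above, arg_below)                   # close to +pi and -pi
```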


Analytic functions. First comes the main notion of an analytic function that 
arises from convergent series expansions and is of obvious relevance to generating- 
functionology. 


Definition IV.1. A function f(z) defined over a region $\Omega$ is analytic at a point $z_0 \in \Omega$
if, for z in some open disc centred at $z_0$ and contained in $\Omega$, it is representable by a
convergent power series expansion


$$f(z) = \sum_{n \ge 0} c_n (z - z_0)^n. \tag{7}$$


A function is analytic in a region $\Omega$ iff it is analytic at every point of $\Omega$.


As derived from an elementary property of power series (Note IV.4), given a
function f that is analytic at a point $z_0$, there exists a disc (of possibly infinite radius)
with the property that the series representing f(z) is convergent for z inside the disc
and divergent for z outside the disc. The disc is called the disc of convergence and
its radius is the radius of convergence of f(z) at $z = z_0$, which will be denoted by
$R_{\mathrm{conv}}(f; z_0)$. The radius of convergence of a power series conveys basic information
regarding the rate at which its coefficients grow; see Subsection IV.3.2 below for
developments. It is also easy to prove by simple series rearrangement that if a function
is analytic at $z_0$, it is then analytic at all points interior to its disc of convergence
(see Appendix B.2: Equivalent definitions of analyticity, p. 741).


> IV.4. The disc of convergence of a power series. Let $f(z) = \sum f_n z^n$ be a power series.
Define R as the supremum of all values of $x \ge 0$ such that $\{f_n x^n\}$ is bounded. Then, for
|z| < R, the sequence $f_n z^n$ tends geometrically to 0; hence f(z) is convergent. For |z| > R,
the sequence $f_n z^n$ is unbounded; hence f(z) is divergent. In short: a power series converges
in the interior of a disc; it diverges in its exterior. ◁

Consider for instance the function f(z) = 1/(1 − z) defined over $\mathbb{C} \setminus \{1\}$ in the
usual way via complex division. It is analytic at 0 by virtue of the geometric series
sum,

$$\frac{1}{1-z} = \sum_{n \ge 0} 1 \cdot z^n,$$

which converges in the disc |z| < 1. At a point $z_0 \neq 1$, we may write

$$\frac{1}{1-z} = \frac{1}{1 - z_0 - (z - z_0)} = \frac{1}{1 - z_0}\,\frac{1}{1 - \frac{z - z_0}{1 - z_0}} = \sum_{n \ge 0} \Bigl(\frac{1}{1 - z_0}\Bigr)^{n+1} (z - z_0)^n. \tag{8}$$


The last equation shows that f(z) is analytic in the disc centred at $z_0$ with radius
$|1 - z_0|$, that is, the interior of the circle centred at $z_0$ and passing through the point 1.
In particular $R_{\mathrm{conv}}(f, z_0) = |1 - z_0|$ and f(z) is globally analytic in the punctured
plane $\mathbb{C} \setminus \{1\}$.
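
The re-expansion of 1/(1 − z) about a point other than the origin can be tested numerically. In the sketch below the centre $z_0 = 2i$ and the evaluation point $z = 1 + 2i$ are arbitrary choices: the point lies outside the unit disc, where the geometric series about 0 diverges, but within distance $|1 - z_0| = \sqrt{5}$ of $z_0$. The function name is hypothetical.

```python
def taylor_about(z0, z, terms=200):
    """Partial sum of sum_{n>=0} (1/(1-z0))^(n+1) (z - z0)^n."""
    q = 1.0 / (1.0 - z0)
    total, term = 0j, q                 # term = q^(n+1) * (z - z0)^n, n = 0
    for _ in range(terms):
        total += term
        term *= q * (z - z0)            # move from index n to index n + 1
    return total

z0 = 2j
z = 1 + 2j
approx = taylor_about(z0, z)
exact = 1.0 / (1.0 - z)
print(approx, exact)
```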

The example of $(1 - z)^{-1}$ illustrates the definition of analyticity. However, the
series rearrangement approach that it uses might be difficult to carry out for more 
complicated functions. In other words, a more manageable approach to analyticity is 
called for. The differentiability properties developed now provide such an approach. 


Differentiable (holomorphic) functions. The next important notion is a geomet- 
ric one based on differentiability. 
Definition IV.2. A function f(z) defined over a region $\Omega$ is called complex-differentiable
(also holomorphic) at $z_0$ if the limit, for complex $\delta$,

$$\lim_{\delta \to 0} \frac{f(z_0 + \delta) - f(z_0)}{\delta}$$

exists. (In particular, the limit is independent of the way $\delta$ tends to 0 in C.) This
limit is denoted as usual by $f'(z_0)$, or $\frac{d}{dz}f(z)\big|_{z_0}$, or $\partial_z f(z_0)$. A function is
complex-differentiable in $\Omega$ iff it is complex-differentiable at every $z_0 \in \Omega$.
From the definition, if f(z) is complex-differentiable at $z_0$ and $f'(z_0) \neq 0$, it acts
locally as a linear transformation:

$$f(z) - f(z_0) = f'(z_0)(z - z_0) + o(z - z_0) \qquad (z \to z_0).$$


Then, f(z) behaves in small regions almost like a similarity transformation (composed 
of a translation, a rotation, and a scaling). In particular, it preserves angles² and
infinitesimal squares get transformed into infinitesimal squares; see Figure IV.3 for a
rendering. Further aspects of the local shape of an analytic function will be examined
in Section VIII.1, p. 543, in relation with the saddle-point method.
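
The requirement that the limit be independent of the direction of approach can be made concrete with difference quotients. The following sketch compares the function $\exp(z) + z + 2$ with the conjugation map $z \mapsto \bar z$, a standard example of a function that is not complex-differentiable (this second function is brought in for contrast and is not discussed in the text). The base point, step size and helper names are arbitrary choices.

```python
import cmath

def difference_quotients(f, z0, eps=1e-6, directions=8):
    """Quotients (f(z0 + d) - f(z0)) / d for d = eps * exp(i*phi), several phi."""
    out = []
    for k in range(directions):
        d = eps * cmath.exp(2j * cmath.pi * k / directions)
        out.append((f(z0 + d) - f(z0)) / d)
    return out

def spread(values):
    """Largest deviation of the quotients from the first one."""
    return max(abs(v - values[0]) for v in values)

z0 = 0.3 + 0.4j
analytic = difference_quotients(lambda z: cmath.exp(z) + z + 2, z0)
non_analytic = difference_quotients(lambda z: z.conjugate(), z0)

print(spread(analytic))        # tiny: every direction gives about exp(z0) + 1
print(spread(non_analytic))    # about 2: the quotient equals exp(-2i*phi)
```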


²A mapping of the plane that locally preserves angles is also called a conformal map. Section VIII.1
(p. 543) presents further properties of the local “shape” of an analytic function.
Figure IV.3. Multiple views of an analytic function. The image of the domain $\Omega =
\{z : |\Re(z)| < 2,\ |\Im(z)| < 2\}$ by f(z) = exp(z) + z + 2: [top] transformation of a
square grid in $\Omega$ by f; [bottom] the modulus and argument of f(z).


For instance the function $\sqrt{z}$, defined by (6) in the complex plane slit along the
ray $(-\infty, 0)$, is complex-differentiable at any $z_0$ of the slit plane since


$$\lim_{\delta \to 0} \frac{\sqrt{z_0 + \delta} - \sqrt{z_0}}{\delta} = \lim_{\delta \to 0} \sqrt{z_0}\,\frac{\sqrt{1 + \delta/z_0} - 1}{\delta} = \frac{1}{2\sqrt{z_0}}, \tag{9}$$

which extends the customary proof of real analysis. Similarly, $\sqrt{1 - z}$ is complex-differentiable
in the complex plane slit along the ray $(1, +\infty)$. More generally, the
usual proofs from real analysis carry over almost verbatim to the complex realm, to
the effect that


$$(f + g)' = f' + g', \qquad (fg)' = f'g + fg', \qquad \Bigl(\frac{1}{f}\Bigr)' = -\frac{f'}{f^2}, \qquad (f \circ g)' = (f' \circ g)\,g'.$$

The notion of complex differentiability is thus much more manageable than the notion
of analyticity.
It follows from a well known theorem of Riemann (see for instance [329, vol. 1,
p. 143] and Appendix B.2: Equivalent definitions of analyticity, p. 741) that analyticity
and complex differentiability are equivalent notions.


Theorem IV.1 (Basic Equivalence Theorem). A function is analytic in a region $\Omega$ if
and only if it is complex-differentiable in $\Omega$.


The following are known facts (see p. 236 and Appendix B): (i) if a function
is analytic (equivalently complex-differentiable) in $\Omega$, it admits (complex) derivatives
of any order there; this property markedly differs from real analysis: complex-differentiable,
equivalently analytic, functions are all smooth; (ii) derivatives of a
function may be obtained through term-by-term differentiation of the series representation
of the function.


Meromorphic functions. We finally introduce meromorphic³ functions that are
mild extensions of the concept of analyticity (or holomorphy) and are essential to 
the theory. The quotient of two analytic functions f(z)/g(z) ceases to be analytic 
at a point a where g(a) = 0; however, a simple structure for quotients of analytic 
functions prevails. 


Definition IV.3. A function h(z) is meromorphic at $z_0$ iff, for z in a neighbourhood of
$z_0$ with $z \neq z_0$, it can be represented as f(z)/g(z), with f(z) and g(z) being analytic
at $z_0$. In that case, it admits near $z_0$ an expansion of the form


$$h(z) = \sum_{n \ge -M} h_n (z - z_0)^n. \tag{10}$$


If $h_{-M} \neq 0$ and $M \ge 1$, then h(z) is said to have a pole of order M at $z = z_0$. The
coefficient $h_{-1}$ is called the residue of h(z) at $z = z_0$ and is written as


$$\operatorname{Res}[h(z);\ z = z_0].$$


A function is meromorphic in a region iff it is meromorphic at every point of the region. 


IV.2.2. Integrals and residues. A path in a region $\Omega$ is described by its parameterization,
which is a continuous function $\gamma$ mapping [0, 1] into $\Omega$. Two paths
$\gamma$, $\gamma'$ in $\Omega$ that have the same end points are said to be homotopic (in $\Omega$) if one can
be continuously deformed into the other while staying within $\Omega$, as in the following
examples:


[Figure: examples of homotopic paths.]


A closed path is defined by the fact that its end points coincide: $\gamma(0) = \gamma(1)$, and a
path is simple if the mapping $\gamma$ is one-to-one. A closed path is said to be a loop of
$\Omega$ if it can be continuously deformed within $\Omega$ to a single point; in this case one also
says that the path is homotopic to 0. In what follows paths are taken to be piecewise
continuously differentiable and, by default, loops are oriented positively.


Integrals along curves in the complex plane are defined in the usual way as curvi- 
linear integrals of complex-valued functions. Explicitly: let f(x + iy) be a function 


³“Holomorphic” and “meromorphic” are words coming from Greek, meaning, respectively, “of
complete form” and “of partial form”.
and $\gamma$ be a path; then,


$$\int_{\gamma} f(z)\,dz := \int_0^1 f(\gamma(t))\,\gamma'(t)\,dt = \int_0^1 [AC - BD]\,dt + i\int_0^1 [AD + BC]\,dt,$$


where $f \circ \gamma = A + iB$ and $\gamma' = C + iD$. However, integral calculus in the complex
plane greatly differs from its form on the real line—in many ways, it is much simpler 
and much more powerful. One has: 


Theorem IV.2 (Null Integral Property). Let f be analytic in $\Omega$ and let $\lambda$ be a simple
loop of $\Omega$. Then, one has $\int_{\lambda} f = 0$.


Equivalently, integrals are largely independent of details of contours: for f analytic
in $\Omega$, one has


$$\int_{\gamma} f = \int_{\gamma'} f, \tag{11}$$


provided $\gamma$ and $\gamma'$ are homotopic (not necessarily closed) paths in $\Omega$. A proof of Theorem
IV.2 is sketched in Appendix B.2: Equivalent definitions of analyticity, p. 741.
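
Path independence can be observed numerically. Here is a minimal sketch, assuming the entire function exp and two homotopic paths from 0 to 1 + i (a straight segment, and a path that follows the real axis to 1 and then rises vertically); the integral is approximated by a midpoint rule on a fine subdivision. Function and variable names are hypothetical.

```python
import cmath

def path_integral(f, gamma, n=20000):
    """Midpoint-rule approximation of the integral of f along gamma: [0,1] -> C."""
    total = 0j
    for k in range(n):
        a, b = gamma(k / n), gamma((k + 1) / n)
        total += f((a + b) / 2) * (b - a)
    return total

end = 1 + 1j

def straight(t):
    """Straight segment from 0 to 1 + i."""
    return t * end

def corner(t):
    """Along the real axis from 0 to 1, then vertically up to 1 + i."""
    return 2 * t if t <= 0.5 else 1 + (2 * t - 1) * 1j

i_straight = path_integral(cmath.exp, straight)
i_corner = path_integral(cmath.exp, corner)
print(i_straight, i_corner, cmath.exp(end) - 1)
```

Both values agree with $e^{1+i} - 1$, the variation of the primitive function between the end points.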


Residues. The important Residue Theorem due to Cauchy relates global prop- 
erties of a meromorphic function (its integral along closed curves) to purely local 
characteristics at designated points (its residues at poles). 


Theorem IV.3 (Cauchy’s residue theorem). Let h(z) be meromorphic in the region $\Omega$
and let $\lambda$ be a positively oriented simple loop in $\Omega$ along which the function is analytic.
Then


$$\frac{1}{2i\pi}\int_{\lambda} h(z)\,dz = \sum_{s} \operatorname{Res}[h(z);\ z = s],$$


where the sum is extended to all poles s of h(z) enclosed by $\lambda$.


Proof. (Sketch) To see it in the representative case where h(z) has only a pole at
z = 0, observe by appealing to primitive functions that


$$\int_{\lambda} h(z)\,dz = \sum_{\substack{n \ge -M \\ n \neq -1}} h_n \Bigl[\frac{z^{n+1}}{n+1}\Bigr]_{\lambda} + h_{-1}\int_{\lambda} \frac{dz}{z},$$

where the bracket notation $[u(z)]_{\lambda}$ designates the variation of the function u(z) along
the contour $\lambda$. This expression reduces to its last term, itself equal to $2i\pi h_{-1}$, as is
checked by using integration along a circle (set $z = re^{i\theta}$). The computation extends
by translation to the case of a unique pole at z = a.

Next, in the case of multiple poles, we observe that the simple loop can only
enclose finitely many poles (by compactness). The proof then follows from a simple
decomposition of the interior domain of $\lambda$ into cells, each containing only one pole.
Here is an illustration in the case of three poles.
[Figure: a loop enclosing three poles, decomposed into three cells.]
(Contributions from internal edges cancel.) ∎


Global (integral) to local (residues) connections. Here is a textbook example of 
a reduction from global to local properties of analytic functions. Define the integrals 


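\[ I_m := \int_{-\infty}^{\infty} \frac{dx}{1 + x^{2m}}, \]

and consider specifically $I_1$. Elementary calculus teaches us that $I_1 = \pi$ since the
antiderivative of the integrand is an arc tangent:

\[ I_1 = \int_{-\infty}^{\infty} \frac{dx}{1 + x^2} = [\arctan x]_{-\infty}^{+\infty} = \pi. \]
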
Here is an alternative, and in many ways more fruitful, derivation. In the light 
of the residue theorem, we consider the integral over the whole line as the limit of 
integrals over large intervals of the form [—R,+R], then complete the contour of 
integration by means of a large semi-circle in the upper half-plane, as shown below: 


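[Figure: the contour formed by the segment $[-R, +R]$ of the real line and a semi-circle of radius $R$ in the upper half-plane.]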


Let $\gamma$ be the contour comprised of the interval and the semi-circle. Inside $\gamma$, the
integrand has a pole at $x = i$, where

\[ \frac{1}{1 + x^2} \equiv \frac{1}{(x + i)(x - i)} = -\frac{i}{2}\,\frac{1}{x - i} + \cdots, \]

so that its residue there is $-i/2$. By the residue theorem, the integral taken over $\gamma$ is
equal to $2i\pi$ times the residue of the integrand at $i$. As $R \to \infty$, the integral along
the semi-circle vanishes (it is less than $\pi R/(R^2 - 1)$ in modulus), while the integral
along the real segment gives $I_1$ in the limit. There results the relation giving $I_1$:

\[ I_1 = 2i\pi \operatorname{Res}\left( \frac{1}{1 + x^2};\ x = i \right) = (2i\pi)\left( -\frac{i}{2} \right) = \pi. \]


The evaluation of the integral in the framework of complex analysis rests solely 
upon the local expansion of the integrand at special points (here, the point i). This is a 
remarkable feature of the theory, one that confers it much simplicity, when compared 
with real analysis. 


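▷ IV.5. The general integral $I_m$. Let $\alpha = \exp(\frac{i\pi}{2m})$ so that $\alpha^{2m} = -1$. Contour integration of
the type used for $I_1$ yields

\[
I_m = 2i\pi \sum_{j=1}^{m} \operatorname{Res}\left( \frac{1}{1 + x^{2m}};\ x = \alpha^{2j-1} \right),
\]

while, for any $\beta = \alpha^{2j-1}$ with $1 \le j \le m$, one has

\[
\frac{1}{1 + x^{2m}} \underset{x \to \beta}{\sim} \frac{1}{2m\beta^{2m-1}}\,\frac{1}{x - \beta} \equiv -\frac{\beta}{2m}\,\frac{1}{x - \beta}.
\]

As a consequence,

\[
I_m = -\frac{i\pi}{m}\left( \alpha + \alpha^3 + \cdots + \alpha^{2m-1} \right) = \frac{\pi}{m \sin\frac{\pi}{2m}}.
\]

In particular, $I_2 = \pi/\sqrt{2}$, $I_3 = 2\pi/3$, $I_4 = \frac{\pi}{4}\sqrt{2}\sqrt{2 + \sqrt{2}}$, and $\frac{1}{\pi} I_5$, $\frac{1}{\pi} I_6$ are expressible by
radicals, but $\frac{1}{\pi} I_7$, $\frac{1}{\pi} I_9$ are not. The special cases $\frac{1}{\pi} I_{17}$, $\frac{1}{\pi} I_{257}$ are expressible by radicals. ◁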
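
As a numerical sanity check on the residue computation of $I_m$, the following Python sketch evaluates the integral in three ways: by summing the residues at the poles $\alpha^{2j-1}$, by the closed form $\pi/(m \sin\frac{\pi}{2m})$, and by direct quadrature. The function names and the substitution $x = \tan\theta$ used for the quadrature are illustrative assumptions, not part of the text.

```python
import cmath
import math


def integral_by_residues(m):
    """I_m via the residue theorem: 2*i*pi times the sum of residues in the upper half-plane."""
    alpha = cmath.exp(1j * math.pi / (2 * m))
    total = 0j
    for j in range(1, m + 1):
        beta = alpha ** (2 * j - 1)                  # pole with positive imaginary part
        total += 1 / (2 * m * beta ** (2 * m - 1))   # residue of 1/(1 + x^(2m)) at beta
    return (2j * math.pi * total).real


def closed_form(m):
    """The closed form pi / (m sin(pi / (2m)))."""
    return math.pi / (m * math.sin(math.pi / (2 * m)))


def integral_by_quadrature(m, steps=2000):
    """Midpoint rule after the substitution x = tan(theta), an assumption made here
    to map the real line onto the bounded interval (-pi/2, pi/2)."""
    width = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2 + (k + 0.5) * width
        c, s = math.cos(theta), math.sin(theta)
        # dx / (1 + x^(2m)) becomes cos^(2m-2) / (cos^(2m) + sin^(2m)) d(theta)
        total += c ** (2 * m - 2) / (c ** (2 * m) + s ** (2 * m))
    return total * width


if __name__ == "__main__":
    for m in range(1, 7):
        print(m, integral_by_residues(m), closed_form(m), integral_by_quadrature(m))
```

The three columns should agree to many digits, which illustrates how a global quantity (the integral) is recovered from purely local data (the residues).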


▷ IV.6. Integrals of rational fractions. Generally, all integrals of rational functions taken over
the whole real line are computable by residues. In particular,

\[
J_m = \int_{-\infty}^{+\infty} \frac{dx}{(1 + x^2)^m}, \qquad
K_m = \int_{-\infty}^{+\infty} \frac{dx}{(1^2 + x^2)(2^2 + x^2) \cdots (m^2 + x^2)}
\]

can be explicitly evaluated. ◁


Cauchy’s coefficient formula. Many function-theoretic consequences are derived 
from the residue theorem. For instance, if $f$ is analytic in $\Omega$, $z_0 \in \Omega$, and $\lambda$ is a simple
loop of $\Omega$ encircling $z_0$, one has

\[ f(z_0) = \frac{1}{2i\pi} \int_\lambda f(\zeta)\,\frac{d\zeta}{\zeta - z_0}. \tag{12} \]

This follows directly since

\[ \operatorname{Res}\left[ f(\zeta)/(\zeta - z_0);\ \zeta = z_0 \right] = f(z_0). \]

Then, by differentiation with respect to $z_0$ under the integral sign, one has similarly

\[ \frac{1}{k!} f^{(k)}(z_0) = \frac{1}{2i\pi} \int_\lambda f(\zeta)\,\frac{d\zeta}{(\zeta - z_0)^{k+1}}. \tag{13} \]

The values of a function and its derivatives at a point can thus be obtained as values of
integrals of the function away from that point. The world of analytic functions is a very
friendly one in which to live: contrary to real analysis, a function is differentiable any
number of times as soon as it is differentiable once. Also, Taylor’s formula invariably
holds: as soon as $f(z)$ is analytic at $z_0$, one has

\[ f(z) = f(z_0) + f'(z_0)(z - z_0) + \frac{1}{2!} f''(z_0)(z - z_0)^2 + \cdots, \tag{14} \]

with the representation being convergent in a disc centred at $z_0$. [Proof: a verification
from (12) and (13), or a series rearrangement as in Appendix B, p. 742.]


A very important application of the residue theorem concerns coefficients of ana- 
lytic functions. 
Theorem IV.4 (Cauchy’s Coefficient Formula). Let $f(z)$ be analytic in a region $\Omega$
containing 0 and let $\lambda$ be a simple loop around 0 in $\Omega$ that is positively oriented.
Then, the coefficient $[z^n] f(z)$ admits the integral representation

\[ f_n \equiv [z^n] f(z) = \frac{1}{2i\pi} \int_\lambda f(z)\,\frac{dz}{z^{n+1}}. \]

Proof. This formula follows directly from the equalities

\[ \frac{1}{2i\pi} \int_\lambda f(z)\,\frac{dz}{z^{n+1}} = \operatorname{Res}\left[ f(z) z^{-n-1};\ z = 0 \right] = [z^n] f(z), \]

of which the first one follows from the residue theorem, and the second one from the
identification of the residue at 0 as a coefficient. ∎
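
The coefficient formula can be tried out numerically. The sketch below is only an illustration under stated assumptions: it takes the loop to be a circle $|z| = r$, discretizes the integral with the trapezoidal rule, and uses $1/(1 - z - z^2)$ (whose coefficients are the Fibonacci numbers) as a test function; none of these choices comes from the text.

```python
import cmath
import math


def cauchy_coefficient(f, n, radius, samples=256):
    """Approximate [z^n] f(z) by the trapezoidal rule applied to Cauchy's formula
    on the circle |z| = radius. With z = radius * e^{i theta}, dz = i z d(theta),
    the integral becomes the average of f(z) / z^n over the circle."""
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        total += f(z) / z ** n
    return total / samples


def fibonacci_gf(z):
    """1 / (1 - z - z^2), analytic in |z| < (sqrt(5) - 1) / 2."""
    return 1 / (1 - z - z * z)


if __name__ == "__main__":
    for n in range(10):
        print(n, round(cauchy_coefficient(fibonacci_gf, n, 0.5).real))
```

Any radius smaller than the distance to the nearest singularity works; the values of the function on the circle, away from 0, determine the coefficients at 0.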


Analytically, the coefficient formula allows us to deduce information about the 
coefficients from the values of the function itself, using adequately chosen contours of
integration. It thus opens the possibility of estimating the coefficients $[z^n] f(z)$ in the
expansion of $f(z)$ near 0 by using information on $f(z)$ away from 0. The rest of this
chapter will precisely illustrate this process in the case of rational and meromorphic 
functions. Observe also that the residue theorem provides the simplest proof of the 
Lagrange inversion theorem (see Appendix A.6: Lagrange Inversion, p. 732) whose 
role is central to tree enumerations, as we saw in Chapters I and II. The notes below 
explore some independent consequences of the residue theorem and the coefficient 
formula. 


▷ IV.7. Liouville’s Theorem. If a function $f(z)$ is analytic in the whole of $\mathbb{C}$ and is of modulus
bounded by an absolute constant, $|f(z)| \le B$, then it must be a constant. [By trivial bounds,
upon integrating on a large circle, it is found that the Taylor coefficients at the origin of index
$\ge 1$ are all equal to 0.] Similarly, if $f(z)$ is of at most polynomial growth, $|f(z)| \le B(|z| + 1)^r$,
over the whole of $\mathbb{C}$, then it must be a polynomial. ◁


▷ IV.8. Lindelöf integrals. Let $a(s)$ be analytic in $\Re(s) > \frac14$ where it is assumed to satisfy
$a(s) = O(\exp((\pi - \delta)|s|))$ for some $\delta$ with $0 < \delta < \pi$. Then, one has for $|\arg(z)| < \delta$,

\[
\sum_{k=1}^{\infty} a(k)(-z)^k = -\frac{1}{2i\pi} \int_{1/2 - i\infty}^{1/2 + i\infty} a(s)\, z^s\, \frac{\pi}{\sin \pi s}\, ds,
\]

in the sense that the integral exists and provides the analytic continuation of the sum in $|\arg(z)| < \delta$.
[Close the integration contour by a large semi-circle on the right and evaluate by residues.]
Such integrals, sometimes called Lindelöf integrals, provide representations for many functions
whose Taylor coefficients are given by an explicit rule [268, 408]. ◁


▷ IV.9. Continuation of polylogarithms. As a consequence of Lindelöf’s representation, the
generalized polylogarithm functions,

\[
\operatorname{Li}_{\alpha,k}(z) = \sum_{n \ge 1} n^{-\alpha} (\log n)^k z^n \qquad (\alpha \in \mathbb{R},\ k \in \mathbb{Z}_{\ge 0}),
\]

are analytic in the complex plane $\mathbb{C}$ slit along $(1, +\infty)$. (More properties are presented in
Section VI.8, p. 408; see also [223, 268].) For instance, one obtains in this way

\[
\text{“} \sum_{n=1}^{\infty} (-1)^n \log n \text{”} = -\frac{1}{4} \int_{-\infty}^{+\infty} \frac{\log(\frac14 + t^2)}{\cosh(\pi t)}\, dt = 0.22579\ldots = \log \sqrt{\frac{\pi}{2}},
\]

when the divergent series on the left is interpreted as $\operatorname{Li}_{0,1}(-1) \equiv \lim_{z \to -1^+} \operatorname{Li}_{0,1}(z)$. ◁
▷ IV.10. Magic duality. Let $\phi$ be a function initially defined over the non-negative integers but
admitting a meromorphic extension over the whole of $\mathbb{C}$. Under growth conditions in the style
of Note IV.8, the function

\[ F(z) := \sum_{n \ge 1} \phi(n)(-z)^n, \]

which is analytic at the origin, is such that, near positive infinity,

\[ F(z) \underset{z \to +\infty}{\sim} E(z) - \sum_{n \ge 1} \phi(-n)(-z)^{-n}, \]

for some elementary function $E(z)$, which is a linear combination of terms of the form $z^\alpha (\log z)^k$.
[Starting from the representation of Note IV.8, close the contour of integration by a large semi-circle
to the left.] In such cases, the function is said to satisfy the principle of magic duality: its
expansions at 0 and $\infty$ are given by one and the same rule. Functions

\[ \frac{1}{1 + z}, \quad \log(1 + z), \quad \exp(-z), \quad \operatorname{Li}_2(-z), \quad \operatorname{Li}_3(-z), \]

satisfy a form of magic duality. Ramanujan [52] made a great use of this principle, which
applies to a wide class of functions including hypergeometric ones; see Hardy’s insightful
discussion [321, Ch XI]. ◁


▷ IV.11. Euler–Maclaurin and Abel–Plana summations. Under simple conditions on the analytic
function $f$, one has Plana’s (also known as Abel’s) complex variables version of the
Euler–Maclaurin summation formula:

\[
\sum_{n=0}^{\infty} f(n) = \frac{1}{2} f(0) + \int_0^{\infty} f(x)\,dx + i \int_0^{\infty} \frac{f(iy) - f(-iy)}{e^{2\pi y} - 1}\, dy.
\]

(See [330, p. 274] for a proof and validity conditions.) ◁


▷ IV.12. Nörlund–Rice integrals. Let $a(z)$ be analytic for $\Re(z) > k_0 - \frac12$ and of at most
polynomial growth in this right half-plane. Then, with $\gamma$ a simple loop around the interval
$[k_0, n]$, one has

\[
\sum_{k=k_0}^{n} \binom{n}{k} (-1)^{n-k} a(k) = \frac{1}{2i\pi} \int_\gamma a(s)\, \frac{n!\, ds}{s(s-1)(s-2) \cdots (s-n)}.
\]

If $a(z)$ is meromorphic and suitably small in a larger region, then the integral can be estimated
by residues. For instance, with

\[
S_n = \sum_{k=1}^{n} \binom{n}{k} \frac{(-1)^k}{k}, \qquad T_n = \sum_{k=1}^{n} \binom{n}{k} \frac{(-1)^k}{k^2 + 1},
\]

it is found that $S_n = -H_n$ (a harmonic number), while $T_n$ oscillates boundedly as $n \to +\infty$.
[This technique is a classical one in the calculus of finite differences, going back to
Nörlund [458]. In computer science it is known as the method of Rice’s integrals [256] and
is used in the analysis of many algorithms and data structures including digital trees and radix
sort [378, 564].] ◁
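
The two alternating sums of this note are easy to examine with exact rational arithmetic. The following Python sketch (function names are illustrative assumptions) checks the identity $S_n = -H_n$ and prints a few values of $T_n$, which can be seen to stay in a bounded range.

```python
from fractions import Fraction
from math import comb


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def s_sum(n):
    """S_n = sum_{k=1}^{n} binom(n, k) (-1)^k / k, computed exactly."""
    return sum(Fraction((-1) ** k * comb(n, k), k) for k in range(1, n + 1))


def t_sum(n):
    """T_n = sum_{k=1}^{n} binom(n, k) (-1)^k / (k^2 + 1), computed exactly."""
    return sum(Fraction((-1) ** k * comb(n, k), k * k + 1) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20, 40):
        print(n, s_sum(n) == -harmonic(n), float(t_sum(n)))
```

Exact fractions are used on purpose: in floating point, the large binomial coefficients with alternating signs would cancel catastrophically.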


IV.3. Singularities and exponential growth of coefficients 


For a given function, a singularity can be informally defined as a point where the 
function ceases to be analytic. (Poles are the simplest type of singularity.) Singu- 
larities are, as we have stressed repeatedly, essential to coefficient asymptotics. This 
section presents the bases of a discussion within the framework of analytic function 
theory. 
IV.3.1. Singularities. Let $f(z)$ be an analytic function defined over the interior
region $\Omega$ determined by a simple closed curve $\gamma$, and let $z_0$ be a point of the bounding
curve $\gamma$. If there exists an analytic function $f^*(z)$ defined over some open set $\Omega^*$
containing $z_0$ and such that $f^*(z) = f(z)$ in $\Omega^* \cap \Omega$, one says that $f$ is analytically
continuable at $z_0$ and that $f^*$ is an immediate analytic continuation of $f$. Pictorially:

[Figure: analytic continuation. The region $\Omega$ (where $f$ is defined, bounded by $\gamma$) overlaps the open set $\Omega^*$ (where $f^*$ is defined), and $f^*(z) = f(z)$ on $\Omega^* \cap \Omega$.]

Consider for instance the quasi-inverse function, $f(z) = 1/(1 - z)$. Its power series
representation $f(z) = \sum_{n \ge 0} z^n$ initially converges in $|z| < 1$. However, the calculation
of (8), p. 231, shows that it is representable locally by a convergent series near
any point $z_0 \ne 1$. In particular, it is continuable at any point of the unit disc except 1.
(Alternatively, one may appeal to complex-differentiability to verify directly
that $f(z)$, which is given by a “global” expression, is holomorphic, hence analytic, in
the punctured plane $\mathbb{C} \setminus \{1\}$.)

In sharp contrast with real analysis, where a smooth function admits of uncountably
many extensions, analytic continuation is essentially unique: if $f^*$ (in $\Omega^*$) and
$f^{**}$ (in $\Omega^{**}$) continue $f$ at $z_0$, then one must have $f^*(z) = f^{**}(z)$ in the intersection
$\Omega^* \cap \Omega^{**}$, which in particular includes a small disc around $z_0$. Thus, the notion of
immediate analytic continuation at a boundary point is intrinsic. The process can be
iterated and we say that $g$ is an analytic continuation⁴ of $f$ along a path, even if the
domains of definition of f and g do not overlap, provided a finite chain of interme- 
diate function elements connects f and g. This notion is once more intrinsic—this is 
known as the principle of unicity of analytic continuation (Rudin [523, Ch. 16] pro- 
vides a thorough discussion). An analytic function is then much like a hologram: as 
soon as it is specified in any tiny region, it is rigidly determined in any wider region 
to which it can be continued. 

Definition IV.4. Given a function $f$ defined in the region interior to the simple closed
curve $\gamma$, a point $z_0$ on the boundary ($\gamma$) of the region is a singular point or a singularity⁵
if $f$ is not analytically continuable at $z_0$.


Granted the intrinsic character of analytic continuation, we can usually dispense with
a detailed description of the original domain $\Omega$ and the curve $\gamma$. In simple terms, a
function is singular at $z_0$ if it cannot be continued as an analytic function beyond $z_0$.
A point at which a function is analytic is also called by contrast a regular point. 

The two functions $f(z) = 1/(1 - z)$ and $g(z) = \sqrt{1 - z}$ may be taken as initially
defined over the open unit disc by their power series representation. Then, as we
already know, they can be analytically continued to larger regions, the punctured plane
$\Omega = \mathbb{C} \setminus \{1\}$ for $f$ [e.g., by the calculation of (8), p. 231] and the complex plane
slit along $(1, +\infty)$ for $g$ [e.g., by virtue of continuity and differentiability as in (9),
p. 232]. But both are singular at 1: for $f$, this results (say) from the fact that $f(z) \to \infty$
as $z \to 1$; for $g$ this is due to the branching character of the square-root. Figure IV.4
displays a few types of singularities that are traceable by the way they deform a regular
grid near a boundary point.

4. The collection of all function elements continuing a given function gives rise to the notion of Riemann
surface, for which many good books exist, e.g., [201, 549]. We shall not need to appeal to this theory.
5. For a detailed discussion, see [165, p. 229], [373, vol. 1, p. 82], or [577].


A converging power series is analytic inside its disc of convergence; in other 
words, it can have no singularity inside this disc. However, it must have at least one 
singularity on the boundary of the disc, as asserted by the theorem below. In addition, a 
classical theorem, called Pringsheim’s theorem, provides a refinement of this property 
in the case of functions with non-negative coefficients, which happens to include all 
counting generating functions. 


Theorem IV.5 (Boundary singularities). A function f(z) analytic at the origin, whose 
expansion at the origin has a finite radius of convergence R, necessarily has a singu- 
larity on the boundary of its disc of convergence, |z| = R. 


Proof. Consider the expansion

\[ f(z) = \sum_{n \ge 0} f_n z^n, \tag{15} \]

assumed to have radius of convergence exactly $R$. We already know that there can
be no singularity of $f$ within the disc $|z| < R$. To prove that there is a singularity
on $|z| = R$, suppose a contrario that $f(z)$ is analytic in the disc $|z| < \rho$ for some
$\rho$ satisfying $\rho > R$. By Cauchy’s coefficient formula (Theorem IV.4, p. 237), upon
integrating along the circle of radius $r = (R + \rho)/2$, and by trivial bounds, it is seen
that the coefficient $[z^n] f(z)$ is $O(r^{-n})$. But then, the series expansion of $f$ would
have to converge in the disc of radius $r > R$, a contradiction. ∎


Pringsheim’s Theorem stated and proved now is a refinement of Theorem IV.5 
that applies to all series having non-negative coefficients, in particular, generating 
functions. It is central to asymptotic enumeration, as the remainder of this section will 
amply demonstrate. 


Theorem IV.6 (Pringsheim’s Theorem). If $f(z)$ is representable at the origin by a
series expansion that has non-negative coefficients and radius of convergence $R$, then
the point $z = R$ is a singularity of $f(z)$.
▷ IV.13. Proof of Pringsheim’s Theorem. (See also [577, Sec. 7.21].) In a nutshell, the idea
of the proof is that if $f$ has positive coefficients and is analytic at $R$, then its expansion slightly
to the left of $R$ has positive coefficients. Then, the power series of $f$ would converge in a disc
larger than the postulated disc of convergence, a clear contradiction.

Suppose then a contrario that $f(z)$ is analytic at $R$, implying that it is analytic in a disc of
radius $r$ centred at $R$. We choose a number $h$ such that $0 < h < \frac13 r$ and consider the expansion
of $f(z)$ around $z_0 = R - h$:

\[ f(z) = \sum_{m \ge 0} g_m (z - z_0)^m. \tag{16} \]

[Figure IV.4 panels: $f_0(z) = \frac{1}{1-z}$, $f_1(z) = e^{z/(1-z)}$, $f_2(z) = -\sqrt{1-z}$, $f_3(z) = -(1-z)^{3/2}$, $f_4(z) = \log\frac{1}{1-z}$.]

Figure IV.4. The images of a grid on the unit square (with corners $\pm 1 \pm i$) by various
functions singular at $z = 1$ reflect the nature of the singularities involved. Singularities
are apparent near the right of each diagram where small grid squares get folded
or unfolded in various ways. (In the case of functions $f_0$, $f_1$, $f_4$ that become infinite
at $z = 1$, the grid has been slightly truncated to the right.)




By Taylor’s formula and the representability of $f(z)$ together with its derivatives at $z_0$ by means
of (15), we have

\[ g_m = \sum_{n \ge 0} \binom{n}{m} f_n z_0^{n-m}, \]

and in particular, $g_m \ge 0$.
Given the way $h$ was chosen, the series (16) converges at $z = R + h$ (so that $z - z_0 = 2h$)
as illustrated by the following diagram:

[Figure: the disc of radius $2h$ centred at $z_0 = R - h$, lying inside the region where $f$ is analytic.]

Consequently, one has

\[ f(R + h) = \sum_{m \ge 0} \left( \sum_{n \ge 0} \binom{n}{m} f_n z_0^{n-m} \right) (2h)^m. \]

This is a converging double sum of positive terms, so that the sum can be reorganized in any
way we like. In particular, one has convergence of all the series involved in

\[
\begin{aligned}
f(R + h) &= \sum_{m, n \ge 0} \binom{n}{m} f_n (R - h)^{n-m} (2h)^m \\
         &= \sum_{n \ge 0} f_n \left[ (R - h) + (2h) \right]^n \\
         &= \sum_{n \ge 0} f_n (R + h)^n.
\end{aligned}
\]

This establishes the fact that $f_n = o((R + h)^{-n})$, thereby reaching a contradiction with the
assumption that the series representation of $f$ has radius of convergence exactly $R$. Pringsheim’s
theorem is proved. ◁

Singularities of a function analytic at 0, which lie on the boundary of the disc of 
convergence, are called dominant singularities. Pringsheim’s theorem appreciably 
simplifies the search for dominant singularities of combinatorial generating functions 
since these have non-negative coefficients—it is sufficient to investigate analyticity 
along the positive real line and detect the first place at which it ceases to hold. 


Example IV.1. Some combinatorial singularities. The derangement and the surjection EGFs,

\[ D(z) = \frac{e^{-z}}{1 - z}, \qquad R(z) = (2 - e^z)^{-1}, \]

are analytic, except for a simple pole at $z = 1$ in the case of $D(z)$, and for points
$\chi_k = \log 2 + 2ik\pi$ that are simple poles in the case of $R(z)$. Thus the dominant singularities for
derangements and surjections are at 1 and $\log 2$, respectively.
It is known that $\sqrt{Z}$ cannot be unambiguously defined as an analytic function in a
neighbourhood of $Z = 0$. As a consequence, the function

\[ G(z) = \frac{1 - \sqrt{1 - 4z}}{2}, \]

which is the generating function of general Catalan trees, is an analytic function in regions that
must exclude 1/4; for instance, one may take the complex plane slit along the ray $(1/4, +\infty)$.
The OGF of Catalan numbers $C(z) = G(z)/z$ is, as $G(z)$, a priori analytic in the slit plane,
except perhaps at $z = 0$, where it has the indeterminate form 0/0. However, after $C(z)$ is
extended by continuity to $C(0) = 1$, it becomes an analytic function at 0, where its Taylor
series converges in $|z| < \frac14$. In this case, we say that $C(z)$ has an apparent or removable
singularity at 0. (See also Morera’s Theorem, Note B.6, p. 743.)
Similarly, the EGF of cyclic permutations

\[ L(z) = \log \frac{1}{1 - z} \]

is analytic in the complex plane slit along $(1, +\infty)$.
A function having no singularity at a finite distance is called entire; its Taylor series then
converges everywhere in the complex plane. The EGFs,

\[ e^{z + z^2/2} \qquad \text{and} \qquad e^{e^z - 1}, \]

associated, respectively, with involutions and set partitions, are entire. □


IV.3.2. The Exponential Growth Formula. We say that a number sequence
$\{a_n\}$ is of exponential order $K^n$, which we abbreviate as (the symbol $\bowtie$ is a “bowtie”)

\[ a_n \bowtie K^n \qquad \text{iff} \qquad \limsup |a_n|^{1/n} = K. \]

The relation “$a_n \bowtie K^n$” reads as “$a_n$ is of exponential order $K^n$”. It expresses both
an upper bound and a lower bound, and one has, for any $\epsilon > 0$:
(i) $|a_n| >_{\text{i.o.}} (K - \epsilon)^n$; that is to say, $|a_n|$ exceeds $(K - \epsilon)^n$ infinitely often (for
infinitely many values of $n$);
(ii) $|a_n| <_{\text{a.e.}} (K + \epsilon)^n$; that is to say, $|a_n|$ is dominated by $(K + \epsilon)^n$ almost
everywhere (except for possibly finitely many values of $n$).

This relation can be rephrased as $a_n = K^n \theta(n)$, where $\theta$ is a subexponential factor:

\[ \limsup |\theta(n)|^{1/n} = 1; \]

such a factor’s modulus is thus bounded from above almost everywhere by any increasing
exponential (of the form $(1 + \epsilon)^n$) and bounded from below infinitely often
by any decaying exponential (of the form $(1 - \epsilon)^n$). Typical subexponential factors are

\[ 1, \quad n^3, \quad (\log n)^2, \quad \sqrt{n}, \quad \frac{1}{\sqrt[3]{\log n}}, \quad n^{-3/2}, \quad (-1)^n, \quad \log\log n. \]

(Functions such as $e^{\sqrt{n}}$ and $\exp(\log^2 n)$ are also to be treated as subexponential factors
for the purpose of this discussion.) The lim sup definition also allows in principle for
factors that are infinitely often very small or 0, such as $n^2 \sin\frac{n\pi}{2}$, $\log n \cos\left(\sqrt{n}\,\frac{\pi}{2}\right)$, and
so on. In this and the next chapters, we shall develop systematic methods that enable
one to extract such subexponential factors from generating functions.
It is an elementary observation that the radius of convergence of the series representation
of $f(z)$ at 0 is related to the exponential growth rate of the coefficients
$f_n = [z^n] f(z)$. To wit, if $R_{\mathrm{conv}}(f; 0) = R$, then we claim that

\[ f_n \bowtie \left( \frac{1}{R} \right)^n, \quad \text{i.e.,} \quad f_n = R^{-n} \theta(n) \ \text{ with } \ \limsup |\theta(n)|^{1/n} = 1. \tag{17} \]

▷ IV.14. Radius of convergence and exponential growth. This only requires the basic definition
of a power series. (i) By definition of the radius of convergence, we have for any small $\epsilon > 0$,
$f_n (R - \epsilon)^n \to 0$. In particular, $|f_n|(R - \epsilon)^n < 1$ for all sufficiently large $n$, so that
$|f_n|^{1/n} < (R - \epsilon)^{-1}$ “almost everywhere”. (ii) In the other direction, for any $\epsilon > 0$, $|f_n|(R + \epsilon)^n$ cannot
be a bounded sequence, since otherwise, $\sum_n |f_n|(R + \epsilon/2)^n$ would be a convergent series.
Thus, $|f_n|^{1/n} > (R + \epsilon)^{-1}$ “infinitely often”. ◁

A global approach to the determination of growth rates is desirable. This is made 
possible by Theorem IV.5, p. 240, as shown by the following statement. 


Theorem IV.7 (Exponential Growth Formula). If $f(z)$ is analytic at 0 and $R$ is the
modulus of a singularity nearest to the origin in the sense that

\[ R := \sup \{\, r \ge 0 \mid f \text{ is analytic in } |z| < r \,\}, \]

then the coefficient $f_n = [z^n] f(z)$ satisfies

\[ f_n \bowtie \left( \frac{1}{R} \right)^n. \]

For functions with non-negative coefficients, including all combinatorial generating
functions, one can also adopt

\[ R := \sup \{\, r \ge 0 \mid f \text{ is analytic at all points of } 0 \le z < r \,\}. \]

Proof. Let $R$ be as stated. We cannot have $R < R_{\mathrm{conv}}(f; 0)$ since a function is analytic
everywhere in the interior of the disc of convergence of its series representation. We
cannot have $R > R_{\mathrm{conv}}(f; 0)$ by the Boundary Singularity Theorem. Thus $R =
R_{\mathrm{conv}}(f; 0)$. The statement then follows from (17). The adaptation to non-negative
coefficients results from Pringsheim’s theorem. ∎
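
To see the exponential growth formula at work on a concrete sequence, the following Python sketch takes the Catalan numbers, whose OGF $(1 - \sqrt{1 - 4z})/(2z)$ is singular at $z = 1/4$, and computes the $n$-th root of the $n$-th coefficient. The choice of example and the function names are assumptions made for illustration; the theorem predicts that the roots approach $1/R = 4$ (slowly, because of the subexponential factor).

```python
import math


def catalan(n):
    """n-th Catalan number, the coefficient [z^n] of (1 - sqrt(1 - 4z)) / (2z)."""
    return math.comb(2 * n, n) // (n + 1)


def nth_root_of_coefficient(n):
    """catalan(n)^(1/n), computed through logarithms to avoid float overflow."""
    return math.exp(math.log(catalan(n)) / n)


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, nth_root_of_coefficient(n))
```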


The exponential growth formula thus directly relates the exponential growth of 
coefficients of a function to the location of its singularities nearest to the origin. This
is precisely expressed by the First Principle of Coefficient Asymptotics (p. 227), which, 
given its importance, we repeat here: 


First Principle of Coefficient Asymptotics. The location of a function’s
singularities dictates the exponential growth ($A^n$) of its coefficients.


Example IV.2. Exponential growth and combinatorial enumeration. Here are a few immediate
applications of exponential bounds.


Surjections. The function

\[ R(z) = (2 - e^z)^{-1} \]

is the EGF of surjections. The denominator is an entire function, so that singularities may
only arise from its zeros, to be found at the points $\chi_k = \log 2 + 2ik\pi$, $k \in \mathbb{Z}$. The dominant
singularity of $R$ is then at $\rho = \chi_0 = \log 2$. Thus, with $r_n = [z^n] R(z)$,

\[ r_n \bowtie \left( \frac{1}{\log 2} \right)^n. \]

    n      (1/n) log r_n     (1/n) log r*_n
    10     0.33385           −0.22508
    20     0.35018           −0.18144
    50     0.35998           −0.154449
    100    0.36325           −0.145447
    ∞      0.36651           −0.13644
           (log 1/ρ)         (log 1/ρ*)

Figure IV.5. The growth rate of simple and double surjections.

6. One should think of the process defining $R$ as follows: take discs of increasing radii $r$ and stop as
soon as a singularity is encountered on the boundary. (The dual process that would start from a large disc
and restrict its radius is in general ill-defined; think of $\sqrt{1 - z}$.)


Similarly, if “double” surjections are considered (each value in the range of the surjection
is taken at least twice), the corresponding EGF is

\[ R^*(z) = \frac{1}{2 + z - e^z}, \]

with the counts starting as 1, 0, 1, 1, 7, 21, 141 (EIS A032032). The dominant singularity is at
$\rho^*$ defined as the positive root of equation $e^{\rho^*} - \rho^* = 2$, and the coefficient $r_n^*$ satisfies:
$r_n^* \bowtie (1/\rho^*)^n$. Numerically, this gives

\[ r_n \bowtie 1.44269^n \qquad \text{and} \qquad r_n^* \bowtie 0.87245^n, \]

with the actual figures for the corresponding logarithms being given in Figure IV.5.

These estimates constitute a weak form of a more precise result to be established later in 
this chapter (p. 260): If random surjections of size n are considered equally likely, the probabil- 
ity of a surjection being a double surjection is exponentially small. 
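
The growth rates of Figure IV.5 can be recomputed from exact counts. The Python sketch below is an illustration under stated assumptions: it obtains the counts from the recurrence $R_n = \sum_k \binom{n}{k} R_{n-k}$ (which is what the identity $R(z) = 1 + B(z) R(z)$ says coefficientwise, with $B(z) = e^z - 1$ for surjections and $B(z) = e^z - 1 - z$ for double surjections), and the function names are hypothetical.

```python
import math


def surjection_counts(n_max, min_block=1):
    """Counts R_0..R_{n_max} of surjections in which every value is taken at least
    min_block times (min_block=1: surjections; min_block=2: double surjections).
    Recurrence: pick the k elements mapped to the first value, then recurse."""
    counts = [1]
    for n in range(1, n_max + 1):
        counts.append(sum(math.comb(n, k) * counts[n - k]
                          for k in range(min_block, n + 1)))
    return counts


def growth_rate(counts, n):
    """(1/n) log(R_n / n!), the quantity tabulated in Figure IV.5 (needs R_n > 0)."""
    return (math.log(counts[n]) - math.lgamma(n + 1)) / n


if __name__ == "__main__":
    simple = surjection_counts(100, 1)
    double = surjection_counts(100, 2)
    print(double[:7])
    for n in (10, 20, 50, 100):
        print(n, growth_rate(simple, n), growth_rate(double, n))
    print("limit for surjections:", math.log(1 / math.log(2)))
```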


Derangements. For the cases $d_{1,n} = [z^n] e^{-z}(1 - z)^{-1}$ and $d_{2,n} = [z^n] e^{-z - z^2/2}(1 - z)^{-1}$,
we have, from the poles at $z = 1$,

\[ d_{1,n} \bowtie 1^n \qquad \text{and} \qquad d_{2,n} \bowtie 1^n. \]

The implied upper bound is combinatorially trivial. The lower bound expresses that the
probability for a random permutation to be a derangement is not exponentially small. For $d_{1,n}$, we
have already proved (p. 225) by an elementary argument the stronger result $d_{1,n} \to e^{-1}$; in the
case of $d_{2,n}$, we shall establish later (p. 261) the precise asymptotic estimate $d_{2,n} \to e^{-3/2}$.


Unary–binary trees. The expression

\[ U(z) = \frac{1 - z - \sqrt{1 - 2z - 3z^2}}{2z} = z + z^2 + 2z^3 + 4z^4 + 9z^5 + \cdots, \]

represents the OGF of (plane unlabelled) unary–binary trees. From the equivalent form,

\[ U(z) = \frac{1 - z - \sqrt{(1 - 3z)(1 + z)}}{2z}, \]

it follows that $U(z)$ is analytic in the complex plane slit along $(\frac13, +\infty)$ and $(-\infty, -1)$ and is
singular at $z = -1$ and $z = 1/3$ where it has branch points. The closest singularity to the origin
being at $\frac13$, one has

\[ U_n \bowtie 3^n. \]

In this case, the stronger upper bound $U_n \le 3^n$ results directly from the possibility of encoding
such trees by words over a ternary alphabet using Łukasiewicz codes (Chapter I, p. 74). A
complete asymptotic expansion will be obtained, as one of the first applications of singularity
analysis, in Chapter VI (p. 396). □
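
The unary–binary tree counts can be generated directly and compared with the bound $3^n$. This Python sketch assumes the functional equation $U = z(1 + U + U^2)$, which is the quadratic equation whose solution is the closed form of $U(z)$ given above; the function name is illustrative.

```python
import math


def unary_binary_counts(n_max):
    """Coefficients U_0..U_{n_max} of U(z), extracted from U = z (1 + U + U^2)."""
    u = [0] * (n_max + 1)
    if n_max >= 1:
        u[1] = 1
    for n in range(2, n_max + 1):
        # [z^n] z (1 + U + U^2) = U_{n-1} + sum over i + j = n - 1 of U_i U_j
        u[n] = u[n - 1] + sum(u[i] * u[n - 1 - i] for i in range(n))
    return u


if __name__ == "__main__":
    u = unary_binary_counts(200)
    print(u[:8])
    for n in (10, 50, 200):
        print(n, math.exp(math.log(u[n]) / n), u[n] <= 3 ** n)
```

The $n$-th roots increase towards 3 while every count stays below $3^n$, in agreement with both the exponential order and the coding argument.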


▷ IV.15. Coding theory bounds and singularities. Let $\mathcal{C}$ be a combinatorial class. We say that
it can be encoded with $f(n)$ bits if, for all sufficiently large values of $n$, elements of $\mathcal{C}_n$ can be
encoded as words of $f(n)$ bits. (An interesting example occurs in Note I.23, p. 53.) Assume
that $\mathcal{C}$ has OGF $C(z)$ with radius of convergence $R$ satisfying $0 < R < 1$. Then, for any $\epsilon$,
$\mathcal{C}$ can be encoded with $(1 + \epsilon)\kappa n$ bits where $\kappa = -\log_2 R$, but $\mathcal{C}$ cannot be encoded with
$(1 - \epsilon)\kappa n$ bits.

Similarly, if $\mathcal{C}$ has EGF $C(z)$ with radius of convergence $R$ satisfying $0 < R < \infty$, then $\mathcal{C}$
can be encoded with $n \log(n/e) + (1 + \epsilon)\kappa n$ bits where $\kappa = -\log R$, but $\mathcal{C}$ cannot be encoded
with $n \log(n/e) + (1 - \epsilon)\kappa n$ bits. Since the radius of convergence is determined by the distance
to singularities nearest to the origin, we have the following interesting fact: singularities convey
information on optimal codes. ◁


Saddle-point bounds. The exponential growth formula (Theorem IV.7, p. 244) 
can be supplemented by effective upper bounds which are very easy to derive and 
often turn out to be surprisingly accurate. We state: 


Proposition IV.1 (Saddle-point bounds). Let $f(z)$ be analytic in the disc $|z| < R$ with
$0 < R \le \infty$. Define $M(f; r)$ for $r \in (0, R)$ by $M(f; r) := \sup_{|z| = r} |f(z)|$. Then,
one has, for any $r$ in $(0, R)$, the family of saddle-point upper bounds

\[
\left| [z^n] f(z) \right| \le \frac{M(f; r)}{r^n} \quad \text{implying} \quad \left| [z^n] f(z) \right| \le \inf_{r \in (0, R)} \frac{M(f; r)}{r^n}. \tag{18}
\]

If in addition $f(z)$ has non-negative coefficients at 0, then

\[
[z^n] f(z) \le \frac{f(r)}{r^n} \quad \text{implying} \quad [z^n] f(z) \le \inf_{r \in (0, R)} \frac{f(r)}{r^n}. \tag{19}
\]

Proof. In the general case of (18), the first inequality results from trivial bounds applied
to the Cauchy coefficient formula, when integration is performed along a circle:

\[ [z^n] f(z) = \frac{1}{2i\pi} \int_{|z| = r} f(z)\, \frac{dz}{z^{n+1}}. \]

It is consequently valid for any $r$ smaller than the radius of convergence of $f$ at 0. The
second inequality in (18) plainly represents the best possible bound of this type.
In the positive case of (19), the bounds can be viewed as a direct specialization
of (18). (Alternatively, they can be obtained in a straightforward manner, since

\[ f_n \le \frac{f_0}{r^n} + \cdots + \frac{f_{n-1}}{r} + f_n + f_{n+1} r + f_{n+2} r^2 + \cdots, \]

whenever the $f_k$ are non-negative.) ∎
Note that the value $s$ that provides the best bound in (19) can be determined by
setting a derivative to zero,

\[ s\, \frac{f'(s)}{f(s)} = n. \tag{20} \]

Thanks to the universal character of the first bound, any approximate solution of this
last equation will in fact provide a valid upper bound.


We shall see in Chapter VIII another way to conceive of these bounds as a first
step in an important method of asymptotic analysis; namely, the saddle-point method,
which explains where the term “saddle-point bound” originates (Theorem VIII.2,
p. 547). For reasons that are well developed there, the bounds usually capture the
actual asymptotic behaviour up to a polynomial factor. A typical instance is the weak
form of Stirling’s formula,

\[ \frac{1}{n!} \equiv [z^n] e^z \le \frac{e^n}{n^n}, \]

which only overestimates the true asymptotic value by a factor of $\sqrt{2\pi n}$.
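
The size of the overestimate in the weak form of Stirling’s formula is easy to check numerically. The sketch below (function names are illustrative assumptions) evaluates the ratio between the saddle-point bound $e^n/n^n$, taken at $r = n$, and the exact coefficient $1/n!$, working with logarithms throughout.

```python
import math


def log_saddle_point_bound(n):
    """log of e^n / n^n, the bound (19) for [z^n] e^z evaluated at r = n."""
    return n - n * math.log(n)


def overestimation_factor(n):
    """Ratio (e^n / n^n) / (1 / n!), computed through logarithms."""
    return math.exp(log_saddle_point_bound(n) + math.lgamma(n + 1))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, overestimation_factor(n), math.sqrt(2 * math.pi * n))
```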


▷ IV.16. A suboptimal but easy saddle-point bound. Let $f(z)$ be analytic in $|z| < 1$ with
non-negative coefficients. Assume that $f(x) \le (1 - x)^{-\beta}$ for some $\beta \ge 0$ and all $x \in (0, 1)$.
Then

\[ [z^n] f(z) = O(n^\beta). \]

(Better bounds of the form $O(n^{\beta - 1})$ are usually obtained by the method of singularity analysis
expounded in Chapter VI.) ◁


Example IV.3. Combinatorial examples of saddle-point bounds. Here are applications to
fragmented permutations, set partitions (Bell numbers), involutions, and integer partitions.


Fragmented permutations. First, fragmented permutations (Chapter II, p. 125) are labelled
structures defined by $\mathcal{F} = \mathrm{SET}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$. The EGF is $e^{z/(1-z)}$; we claim that

\[ \frac{1}{n!} F_n \equiv [z^n] e^{z/(1-z)} \le e^{2\sqrt{n} - \frac12 + O(n^{-1/2})}. \tag{21} \]

Indeed, the minimizing radius of the saddle-point bound (19) is $s$ such that

\[ 0 = \frac{d}{ds}\left( \frac{s}{1 - s} - n \log s \right) = \frac{1}{(1 - s)^2} - \frac{n}{s}. \]

The equation is solved by $s = (2n + 1 - \sqrt{4n + 1})/(2n)$. One can either use this exact value and
compute an asymptotic approximation of $f(s)/s^n$, or adopt right away the approximate value
$s_1 = 1 - 1/\sqrt{n}$, which leads to simpler calculations. The estimate (21) results. It is off from
the actual asymptotic value only by a factor of order $n^{-3/4}$ (cf. Example VIII.7, p. 562).
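
The two choices of radius for fragmented permutations can be compared on actual numbers. The Python sketch below rests on an assumption that is not in the text: it computes the exact coefficient from the expansion $[z^n] e^{z/(1-z)} = \sum_{k \ge 1} \binom{n-1}{k-1}/k!$, obtained by expanding the exponential and using $[z^n] (z/(1-z))^k = \binom{n-1}{k-1}$. All function names are illustrative.

```python
from fractions import Fraction
import math


def fragmented_coefficient(n):
    """[z^n] e^{z/(1-z)} for n >= 1, as an exact fraction."""
    return sum(Fraction(math.comb(n - 1, k - 1), math.factorial(k))
               for k in range(1, n + 1))


def log_fraction(q):
    """Natural logarithm of a positive Fraction, safe for very large terms."""
    return math.log(q.numerator) - math.log(q.denominator)


def log_bound(n, s):
    """log of f(s) / s^n for f(z) = e^{z/(1-z)} and 0 < s < 1."""
    return s / (1 - s) - n * math.log(s)


def exact_saddle(n):
    """Root of 1/(1-s)^2 = n/s lying in (0, 1)."""
    return (2 * n + 1 - math.sqrt(4 * n + 1)) / (2 * n)


def approximate_saddle(n):
    """The simpler radius s_1 = 1 - 1/sqrt(n)."""
    return 1 - 1 / math.sqrt(n)


if __name__ == "__main__":
    for n in (10, 100, 400):
        print(n,
              log_fraction(fragmented_coefficient(n)),
              log_bound(n, exact_saddle(n)),
              log_bound(n, approximate_saddle(n)),
              2 * math.sqrt(n) - 0.5)
```

Each row lists the logarithm of the exact coefficient, the two bounds, and the main terms of (21); the approximate radius loses very little compared with the exact one.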


Bell numbers and set partitions. Another immediate application is an upper bound on
Bell numbers enumerating set partitions, $\mathcal{S} = \mathrm{SET}(\mathrm{SET}_{\ge 1}(\mathcal{Z}))$, with EGF $e^{e^z - 1}$. According
to (20), the best saddle-point bound is obtained for $s$ such that $s e^s = n$. Thus,

\[ \frac{1}{n!} S_n \le e^{e^s - 1 - n \log s} \quad \text{where} \quad s : s e^s = n; \tag{22} \]

additionally, one has $s = \log n - \log\log n + o(\log\log n)$. See Chapter VIII, p. 561 for the
complete saddle-point analysis.
    n      $\widetilde{I}_n$           $I_n$
    100    0.106579 · 10^85      0.240533 · 10^83
    200    0.231809 · 10^195     0.367247 · 10^193
    300    0.383502 · 10^316     0.494575 · 10^314
    400    0.869362 · 10^444     0.968454 · 10^442
    500    0.425391 · 10^578     0.423108 · 10^576

Figure IV.6. A comparison of the exact number of involutions $I_n$ to its approximation
$\widetilde{I}_n = n!\, e^{\sqrt{n} + n/2}\, n^{-n/2}$: [left] a table; [right] a plot of $\log_{10}(I_n/\widetilde{I}_n)$ against
$\log_{10} n$ suggesting that the ratio satisfies $I_n/\widetilde{I}_n \sim K \cdot n^{-1/2}$, the slope of the curve
being $\approx -\frac12$.


Involutions. Involutions are specified by $\mathcal{I} = \mathrm{SET}(\mathrm{CYC}_{1,2}(\mathcal{Z}))$ and have EGF $I(z) =
\exp(z + \frac12 z^2)$. One determines, by choosing $s = \sqrt{n}$ as an approximate solution to (20):

\[ \frac{1}{n!} I_n \le \frac{e^{\sqrt{n} + n/2}}{n^{n/2}}. \tag{23} \]

(See Figure IV.6 for numerical data and Example VIII.5, p. 558 for a full analysis.) Similar
bounds hold for permutations with all cycle lengths $\le k$ and permutations $\sigma$ such that $\sigma^k = \mathrm{Id}$.
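
The comparison of Figure IV.6 can be reproduced in a few lines. The Python sketch below is illustrative only: it assumes the classical recurrence $I_n = I_{n-1} + (n-1) I_{n-2}$ for the number of involutions (equivalent to $I'(z) = (1 + z) I(z)$), and it compares the exact counts with $n!$ times the right-hand side of (23) through logarithms.

```python
import math


def involution_counts(n_max):
    """I_0..I_{n_max} from the recurrence I_n = I_{n-1} + (n-1) I_{n-2}."""
    counts = [1, 1]
    for n in range(2, n_max + 1):
        counts.append(counts[n - 1] + (n - 1) * counts[n - 2])
    return counts[: n_max + 1]


def log_approximation(n):
    """log of n! e^{sqrt(n) + n/2} n^{-n/2}, i.e. n! times the bound (23)."""
    return math.lgamma(n + 1) + math.sqrt(n) + n / 2 - (n / 2) * math.log(n)


if __name__ == "__main__":
    counts = involution_counts(500)
    for n in (100, 200, 300, 400, 500):
        gap = log_approximation(n) - math.log(counts[n])
        # gap > 0 confirms the bound; exp(gap) / sqrt(n) is nearly constant
        print(n, gap, math.exp(gap) / math.sqrt(n))
```

The last column staying roughly constant is another way of reading the slope $-\frac12$ reported in the caption of Figure IV.6.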


Integer partitions. The function 


[o.@) 1 oO zf 
24 P(z)= = = 
(24) (2) aie: exp Dtiw 


is the OGF of integer partitions, an unlabelled analogue of set partitions. Its radius of
convergence is a priori bounded from above by 1, since the set $\mathcal{P}$ is infinite and the second form
of $P(z)$ shows that it is exactly equal to 1. Therefore $P_n \bowtie 1^n$. A finer upper bound results
from the estimate (see also p. 576)


$$L(t) := \log P(e^{-t}) \sim \frac{\pi^2}{6t} + \log\sqrt{\frac{t}{2\pi}} - \frac{1}{24}\,t + O(t^2), \tag{25}$$


which is obtained from Euler–Maclaurin summation or, better, from a Mellin analysis following
Appendix B.7: Mellin transform, p. 762. Indeed, the Mellin transform of $L$ is, by the
harmonic sum rule,


$$L^\star(s) = \zeta(s)\,\zeta(s+1)\,\Gamma(s), \qquad s \in \langle 1, +\infty\rangle,$$


and the successive left-most poles at $s = 1$ (simple pole), $s = 0$ (double pole), and $s = -1$
(simple pole) translate into the asymptotic expansion (25). When $z \to 1^-$, we have


$$P(z) \sim \frac{e^{-\pi^2/12}}{\sqrt{2\pi}}\,\sqrt{1-z}\,\exp\left(\frac{\pi^2}{6(1-z)}\right), \tag{26}$$


from which we derive (choose $s = D\sqrt{n}$ as an approximate solution to (20))

$$P_n \leq C\, n^{-1/4}\, e^{\pi\sqrt{2n/3}},$$


for some $C > 0$. This last bound is once more only off by a polynomial factor, as we shall
prove when studying the saddle-point method (Proposition VIII.6, p. 578). ■
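
A quick numerical experiment makes the last bound concrete. The sketch below is an illustrative addition with hypothetical function names: it computes exact partition numbers by expanding the product form of (24) one factor at a time, then evaluates $P_n$ divided by $n^{-1/4} e^{\pi\sqrt{2n/3}}$. If the bound holds with some constant $C$, this normalized quantity must stay bounded; in practice it decreases, reflecting the polynomial gap mentioned above.

```python
import math


def partition_numbers(n_max):
    """Exact partition numbers P_0..P_{n_max}, multiplying in 1/(1 - z^k) for k = 1..n_max."""
    counts = [1] + [0] * n_max
    for part in range(1, n_max + 1):
        # In-place update: multiplies the truncated series by 1/(1 - z^part).
        for total in range(part, n_max + 1):
            counts[total] += counts[total - part]
    return counts


def normalized_ratio(n, counts):
    """P_n divided by n^(-1/4) * exp(pi * sqrt(2n/3)), computed through logarithms."""
    log_scale = -0.25 * math.log(n) + math.pi * math.sqrt(2.0 * n / 3.0)
    return math.exp(math.log(counts[n]) - log_scale)


if __name__ == "__main__":
    counts = partition_numbers(400)
    for n in (50, 100, 200, 400):
        print(n, counts[n], normalized_ratio(n, counts))
```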
▷ IV.17. A natural boundary. One has $P(re^{i\theta}) \to \infty$ as $r \to 1^-$, for any angle $\theta$ that is a
rational multiple of $2\pi$. The points $e^{2i\pi p/q}$ being dense on the unit circle, the function $P(z)$
admits the unit circle as a natural boundary; that is, it cannot be analytically continued beyond
this circle. ◁


IV.4. Closure properties and computable bounds


Analytic functions are robust: they satisfy a rich set of closure properties. This 
fact makes possible the determination of exponential growth constants for coefficients 
of a wide range of classes of functions. Theorem IV.8 below expresses computability 
of growth rate for all specifications associated with iterative specifications. It is the 
first result of several that relate symbolic methods of Part A with analytic methods 
developed here. 


Closure properties of analytic functions. The functions analytic at a point $z = a$
are closed under sum and product, and hence form a ring. If $f(z)$ and $g(z)$ are analytic
at $z = a$, then so is their quotient $f(z)/g(z)$ provided $g(a) \neq 0$. Meromorphic
functions are furthermore closed under quotient and hence form a field. Such properties
are proved most easily using complex-differentiability and extending the usual
relations from real analysis, for instance, $(f + g)' = f' + g'$, $(fg)' = fg' + f'g$.

Analytic functions are also closed under composition: if $f(z)$ is analytic at $z = a$
and $g(w)$ is analytic at $b = f(a)$, then $g \circ f(z)$ is analytic at $z = a$. Graphically:


$$z \;\overset{f}{\longmapsto}\; f(z) \;\overset{g}{\longmapsto}\; g(f(z)).$$


The proof based on complex-differentiability closely mimics the real case. Inverse
functions exist conditionally: if $f'(a) \neq 0$, then $f(z)$ is locally linear near $a$, hence
invertible, so that there exists a $g$ satisfying $f \circ g = g \circ f = \mathrm{Id}$, where $\mathrm{Id}$ is
the identity function, $\mathrm{Id}(z) = z$. The inverse function is itself locally linear, hence
complex-differentiable, hence analytic. In short: the inverse of an analytic function f
at a place where the derivative does not vanish is an analytic function. We shall return
to this important property later in this chapter (Subsection IV.7.1, p. 275), then put it
to full use in Chapters VI (p. 402) and VII (p. 452) in order to derive strong asymptotic
properties of simple varieties of trees.
▷ IV.18. A Mean Value Theorem for analytic functions. Let $f$ be analytic in $\Omega$ and assume the
existence of $M := \sup_{z\in\Omega} |f'(z)|$. Then, for all $a$, $b$ in $\Omega$, one has
$$|f(b) - f(a)| \leq 2M|b - a|.$$

(Hint: a simple consequence of the Mean Value Theorem applied to $\Re(f)$, $\Im(f)$.) ◁

▷ IV.19. The analytic inversion lemma. Let $f$ be analytic on $\Omega \ni z_0$ and satisfy $f'(z_0) \neq 0$.
Then, there exists a small region $\Omega_1 \subseteq \Omega$ containing $z_0$ and a $C > 0$ such that
$|f(z) - f(z')| > C|z - z'|$, for all $z, z' \in \Omega_1$, $z \neq z'$. Consequently, $f$ maps bijectively $\Omega_1$ on $f(\Omega_1)$.
(See also Subsection IV.6.2, p. 269, for a proof based on integration.) ◁

One way to establish closure properties, as suggested above, is to deduce analyt- 
icity criteria from complex differentiability by way of the Basic Equivalence Theorem 
(Theorem IV.1, p. 232). An alternative approach, closer to the original notion of ana- 
lyticity, can be based on a two-step process: (i) closure properties are shown to hold 
true for formal power series; (ii) the resulting formal power series are proved to be
locally convergent by means of suitable majorizations on their coefficients. This is the 
basis of the classical method of majorant series originating with Cauchy. 

▷ IV.20. The majorant series technique. Given two power series, define $f(z) \trianglelefteq g(z)$ if
$|[z^n]f(z)| \leq [z^n]g(z)$ for all $n \geq 0$. The following two conditions are equivalent: (i) $f(z)$
is analytic in the disc $|z| < \rho$; (ii) for any $r > \rho^{-1}$ there exists a $c$ such that

$$f(z) \trianglelefteq \frac{c}{1-rz}.$$

If $f$, $g$ are majorized by $c/(1-rz)$, $d/(1-rz)$, respectively, then $f + g$ and $f \cdot g$ are majorized,

$$f(z) + g(z) \trianglelefteq \frac{c+d}{1-rz}, \qquad f(z)\cdot g(z) \trianglelefteq \frac{e}{1-sz},$$

for any $s > r$ and for some $e$ dependent on $s$. Similarly, the composition $f \circ g$ is majorized:

$$f \circ g(z) \trianglelefteq \frac{c}{1 - r(1+d)z}.$$

Constructions for $1/f$ and for the functional inverse of $f$ can be similarly developed. See
Cartan’s book [104] and van der Hoeven’s study [587] for a systematic treatment. ◁

As a consequence of closure properties, for functions defined by analytic expres- 
sions, singularities can be determined inductively in an intuitively transparent manner. 
If Sing(f) and Zero(f) are, respectively, the set of singularities and zeros of the func- 
tion f, then, due to closure properties of analytic functions, the following informally 
stated guidelines apply. 


    Sing(f ± g)       ⊆  Sing(f) ∪ Sing(g)
    Sing(f × g)       ⊆  Sing(f) ∪ Sing(g)
    Sing(f / g)       ⊆  Sing(f) ∪ Sing(g) ∪ Zero(g)
    Sing(f ∘ g)       ⊆  Sing(g) ∪ g^{(-1)}(Sing(f))
    Sing(√f)          ⊆  Sing(f) ∪ Zero(f)
    Sing(log(f))      ⊆  Sing(f) ∪ Zero(f)
    Sing(f^{(-1)})    ⊆  f(Sing(f)) ∪ f(Zero(f')).


A mathematically rigorous treatment would require considering multivalued func- 
tions and Riemann surfaces, so that we do not state detailed validity conditions and 
keep for these formulae the status of useful heuristics. In fact, because of Pringsheim’s
theorem, the search for dominant singularities of combinatorial generating functions can
normally avoid considering the complete multivalued structure of functions, since only
some initial segment of the positive real half-line needs to be considered. This in turn 
implies a powerful and easy way of determining the exponential order of coefficients 
of a wide variety of generating functions, as we explain next. 


Computability of exponential growth constants. As defined in Chapters I and II, 
a combinatorial class is constructible or specifiable if it can be specified by a finite set 
of equations involving only the basic constructors. A specification is iterative or non- 
recursive if in addition the dependency graph (p. 33) of the specification is acyclic. 
In that case, no recursion is involved and a single functional term (written with sums, 
products, sequences, sets, and cycles) describes the specification. 
Our interest here is in effective computability issues. We recall that a real number
$\alpha$ is computable iff there exists a program $\Pi_\alpha$, which, on input $m$, outputs a rational
number $\alpha_m$ guaranteed to be within $\pm 10^{-m}$ of $\alpha$. We state:


Theorem IV.8 (Computability of growth). Let $\mathcal{C}$ be a constructible unlabelled class
that admits an iterative specification in terms of (SEQ, PSET, MSET, CYC; +, ×)
starting with $(\mathbf{1}, \mathcal{Z})$. Then, the radius of convergence $\rho_C$ of the OGF $C(z)$ of $\mathcal{C}$ is
either $+\infty$ or a (strictly) positive computable real number.

Let $\mathcal{D}$ be a constructible labelled class that admits an iterative specification in
terms of (SEQ, SET, CYC; +, ⋆) starting with $(\mathbf{1}, \mathcal{Z})$. Then, the radius of convergence
$\rho_D$ of the EGF $D(z)$ of $\mathcal{D}$ is either $+\infty$ or a (strictly) positive computable real number.

Accordingly, if finite, the constants $\rho_C$, $\rho_D$ in the exponential growth estimates,


$$[z^n]C(z) \equiv C_n \bowtie \left(\frac{1}{\rho_C}\right)^n, \qquad [z^n]D(z) \equiv \frac{1}{n!} D_n \bowtie \left(\frac{1}{\rho_D}\right)^n,$$


are computable numbers.


Proof. In both cases, the proof proceeds by induction on the structural specification of
the class. For each class $\mathcal{F}$, with generating function $F(z)$, we associate a signature,
which is an ordered pair $(\rho_F, \tau_F)$, where $\rho_F$ is the radius of convergence of $F$ and $\tau_F$
is the value of $F$ at $\rho_F$, precisely,
$$\tau_F := \lim_{x \to \rho_F^-} F(x).$$

(The value $\tau_F$ is well defined as an element of $\mathbb{R} \cup \{+\infty\}$ since $F$, being a counting
generating function, is necessarily increasing on $(0, \rho_F)$.)


Unlabelled case. An unlabelled class $\mathcal{G}$ is either finite, in which case its OGF
$G(z)$ is a polynomial, or infinite, in which case it diverges at $z = 1$, so that $\rho_G \leq 1$. It
is clearly decidable, given the specification, whether a class is finite or not: a necessary
and sufficient condition for a class to be infinite is that one of the unary constructors
(SEQ, MSET, CYC) intervenes in the specification. We prove (by induction) the assertion
of the theorem together with the stronger property that $\tau_F = \infty$ as soon as the
class is infinite.

First, the signatures of the neutral class $\mathbf{1}$ and the atomic class $\mathcal{Z}$, with OGF $1$ and
$z$, are $(+\infty, 1)$ and $(+\infty, +\infty)$. Any non-constant polynomial which is the OGF of
a finite set has the signature $(+\infty, +\infty)$. The assertion is thus easily verified in these
cases.

Next, let $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$. The OGF $G(z)$ must be non-constant and satisfy $G(0) = 0$,
in order for the sequence construction to be properly defined. Thus, by the induction
hypothesis, one has $0 < \rho_G \leq +\infty$ and $\tau_G = +\infty$. Now, the function $G$ being
increasing and continuous along the positive axis, there must exist a value $\beta$ such that
$0 < \beta < \rho_G$ with $G(\beta) = 1$. For $z \in (0, \beta)$, the quasi-inverse $F(z) = (1 - G(z))^{-1}$
is well defined and analytic; as $z$ approaches $\beta$ from the left, $F(z)$ increases unboundedly.
Thus, the smallest singularity of $F$ along the positive axis is at $\beta$, and
by Pringsheim’s theorem, one has $\rho_F = \beta$. The argument shows at the same time that
$\tau_F = +\infty$. There only remains to check that $\beta$ is computable. The coefficients of
$G$ form a computable sequence of integers, so that $G(x)$, which can be well approximated
via a truncated Taylor series, is an effectively computable number⁷ if $x$ is itself
a positive computable number less than $\rho_G$. Then, binary search provides an effective
procedure for determining $\beta$.
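
The binary search invoked at the end of this step can be sketched as follows. This is a minimal floating-point illustration rather than the certified procedure the proof has in mind: the helper names are hypothetical, $G$ is represented by a finite list of Taylor coefficients, and the caller is assumed to supply an upper endpoint below $\rho_G$ at which the truncated series already exceeds 1.

```python
def truncated_value(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) by Horner's rule."""
    total = 0.0
    for c in reversed(coeffs):
        total = total * x + c
    return total


def seq_radius(coeffs, upper, tol=1e-12):
    """Locate beta in (0, upper) with G(beta) = 1 by bisection.

    Assumes coeffs[0] == 0, non-negative coefficients (so G is increasing on
    the positive axis), and truncated_value(coeffs, upper) > 1.
    """
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_value(coeffs, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # G(z) = z + z^2: SEQ(G) has OGF 1/(1 - z - z^2).
    print(seq_radius([0, 1, 1], 1.0))
    # G(z) = z/(1-z) truncated at degree 200: SEQ(G) has OGF (1-z)/(1-2z).
    print(seq_radius([0] + [1] * 200, 0.9))
```

For $G(z) = z + z^2$ the root is $(\sqrt{5}-1)/2$, the dominant pole of the Fibonacci-type OGF $1/(1 - z - z^2)$; for $G(z) = z/(1-z)$ it is $1/2$, in agreement with the $2^{n-1}$ compositions of $n$.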

Next, we consider the multiset construction, $\mathcal{F} = \mathrm{MSET}(\mathcal{G})$, whose translation
into OGFs necessitates the Pólya exponential of Chapter I (p. 34):


$$F(z) = \mathrm{Exp}(G(z)) \quad\text{where}\quad \mathrm{Exp}(h(z)) := \exp\left(h(z) + \frac{1}{2}h(z^2) + \frac{1}{3}h(z^3) + \cdots\right).$$


Once more, the induction hypothesis is assumed for $G$. If $G$ is a polynomial, then $F$
is a rational function with poles at roots of unity only. Thus, $\rho_F = 1$ and $\tau_F = \infty$
in that particular case. In the general case of $\mathcal{F} = \mathrm{MSET}(\mathcal{G})$ with $\mathcal{G}$ infinite, we start
by fixing arbitrarily a number $r$ such that $0 < r < \rho_G \leq 1$ and examine $F(z)$ for
$z \in (0, r)$. The expression for $F$ rewrites as


$$\mathrm{Exp}(G(z)) = e^{G(z)} \cdot \exp\left(\frac{1}{2}G(z^2) + \frac{1}{3}G(z^3) + \cdots\right).$$


The first factor is analytic for $z$ on $(0, \rho_G)$ since, the exponential function being entire,
$e^G$ has the singularities of $G$. As to the second factor, one has $G(0) = 0$ (in order
for the set construction to be well-defined), while $G(x)$ is convex for $x \in [0, r]$ (since
its second derivative is positive). Thus, there exists a positive constant $K$ such that
$G(x) \leq Kx$ when $x \in [0, r]$. Then, the series $\frac{1}{2}G(z^2) + \frac{1}{3}G(z^3) + \cdots$ has its terms
dominated by those of the convergent series

$$\frac{K}{2}r^2 + \frac{K}{3}r^3 + \cdots = K\log(1-r)^{-1} - Kr.$$


By a well-known theorem of analytic function theory, a uniformly convergent sum of
analytic functions is itself analytic; consequently, $\frac{1}{2}G(z^2) + \frac{1}{3}G(z^3) + \cdots$ is analytic
at all $z$ of $(0, r)$. Analyticity is then preserved by the exponential, so that $F(z)$, being
analytic at $z \in (0, r)$ for any $r < \rho_G$, has a radius of convergence that satisfies $\rho_F \geq \rho_G$.
On the other hand, since $F(z)$ dominates termwise $G(z)$, one has $\rho_F \leq \rho_G$. Thus
finally one has $\rho_F = \rho_G$. Also, $\tau_G = +\infty$ implies $\tau_F = +\infty$.

A parallel discussion covers the case of the powerset construction (PSET) whose
associated functional $\overline{\mathrm{Exp}}$ is a minor modification of the Pólya exponential Exp. The
cycle construction can be treated by similar arguments based on consideration of
“Pólya’s logarithm” as $\mathcal{F} = \mathrm{CYC}(\mathcal{G})$ corresponds to


$$F(z) = \mathrm{Log}\,\frac{1}{1 - G(z)}, \quad\text{where}\quad \mathrm{Log}\,h(z) := \log h(z) + \frac{1}{2}\log h(z^2) + \cdots.$$


In order to conclude with the unlabelled case, it only remains to discuss the binary
constructors +, ×, which give rise to $F = G + H$, $F = G \cdot H$. It is easily verified that
$\rho_F = \min(\rho_G, \rho_H)$. Computability is granted since the minimum of two computable
numbers is computable. That $\tau_F = +\infty$ in each case is immediate.


⁷ The present argument only establishes non-constructively the existence of a program, based on the
fact that truncated Taylor series converge geometrically fast at an interior point of their disc of convergence.
Making explicit this program and the involved parameters from the specification itself however represents a
much harder problem (that of “uniformity” with respect to specifications) that is not addressed here.


Labelled case. The labelled case is covered by the same type of argument as
above, the discussion being even simpler, since the ordinary exponential and logarithm
replace the Pólya operators Exp and Log. It is still a fact that all the EGFs of infinite
non-recursive classes are infinite at their dominant positive singularity, though the
radii of convergence can now be of any magnitude (compared to 1). ■


▷ IV.21. Restricted constructions. This is an exercise in induction. Theorem IV.8 is stated for
specifications involving the basic constructors. Show that the conclusion still holds if the corresponding
restricted constructions ($\mathfrak{K}_{=r}$, $\mathfrak{K}_{<r}$, $\mathfrak{K}_{>r}$, with $\mathfrak{K}$ being any of the basic constructors)
are also allowed. ◁


▷ IV.22. Syntactically decidable properties. For unlabelled classes $\mathcal{F}$, the property $\rho_F = 1$ is
decidable. For labelled and unlabelled classes, the property $\rho_F = +\infty$ is decidable. ◁


▷ IV.23. Pólya–Carlson and a curious property of OGFs. Here is a statement first conjectured
by Pólya, then proved by Carlson in 1921 (see [164, p. 323]): If a function is represented by
a power series with integer coefficients that converges inside the unit disc, then either it is a
rational function or it admits the unit circle as a natural boundary. This theorem applies in
particular to the OGF of any combinatorial class. ◁


▷ IV.24. Trees are recursive structures only! General and binary trees cannot receive an iterative
specification since their OGFs assume a finite value at their Pringsheim singularity. [The
same is true of most simple families of trees; cf. Proposition VI.6, p. 404.] ◁


▷ IV.25. Non-constructibility of permutations and graphs. The class $\mathcal{P}$ of all permutations
cannot be specified as a constructible unlabelled class since the OGF $P(z) = \sum_n n!\,z^n$ has
radius of convergence 0. (It is of course constructible as a labelled class.) Graphs, whether
labelled or unlabelled, are too numerous to form a constructible class. ◁


Theorem IV.8 establishes a link between analytic combinatorics, computability 
theory, and symbolic manipulation systems. It is based on an article of Flajolet, Salvy, 
and Zimmermann [255] devoted to such computability issues in exact and asymptotic 
enumeration. Recursive specifications are not discussed now since they tend to give 
rise to branch points, themselves amenable to singularity analysis techniques to be 
fully developed in Chapters VI and VII. The inductive process, implied by the proof of 
Theorem IV.8, that decorates a specification with the radius of convergence of each of 
its subexpressions, provides a practical basis for determining the exponential growth 
rate of counts associated to a non-recursive specification. 


Example IV.4. Combinatorial trains. This purposely artificial example from [219] (see Figure IV.7)
serves to illustrate the scope of Theorem IV.8 and demonstrate its inner mechanisms
at work. Define the class of all labelled trains by the following specification,


$$\begin{aligned}
\mathcal{T}r &= \mathcal{W}a \star \mathrm{SEQ}(\mathcal{W}a \star \mathrm{SET}(\mathcal{P}a)), \\
\mathcal{W}a &= \mathrm{SEQ}_{\geq 1}(\mathcal{P}\ell), \\
\mathcal{P}\ell &= \mathcal{Z} \star \mathcal{Z} \star (\mathbf{1} + \mathrm{CYC}(\mathcal{Z})), \\
\mathcal{P}a &= \mathrm{CYC}(\mathcal{Z}) \star \mathrm{CYC}(\mathcal{Z}).
\end{aligned} \tag{27}$$


In figurative terms, a train ($\mathcal{T}r$) is composed of a first wagon ($\mathcal{W}a$) to which is appended a
sequence of passenger wagons, each of the latter capable of containing a set of passengers ($\mathcal{P}a$).
A wagon is itself composed of “planks” ($\mathcal{P}\ell$) conventionally identified by their two end points
($\mathcal{Z} \star \mathcal{Z}$) and to which a circular wheel ($\mathrm{CYC}(\mathcal{Z})$) may optionally be attached. A passenger is
composed of a head and a belly that are each circular arrangements of atoms. Here is a depiction
of a random train: [illustration not reproduced].


Figure IV.7. The inductive determination of the radius of convergence of the EGF of
trains: (left) a hierarchical view of the specification of $\mathcal{T}r$; (right) the corresponding
radii of convergence for each subspecification. [Diagrams not reproduced.]


The translation into a set of EGF equations is immediate and a symbolic manipulation system
readily provides the form of the EGF of trains as


$$Tr(z) = \frac{z^2\left(1 + \log((1-z)^{-1})\right)}{1 - z^2\left(1 + \log((1-z)^{-1})\right)}
\left(1 - \frac{z^2\left(1 + \log((1-z)^{-1})\right) e^{\left(\log((1-z)^{-1})\right)^2}}{1 - z^2\left(1 + \log((1-z)^{-1})\right)}\right)^{-1},$$


together with the expansion

$$Tr(z) = 2\frac{z^2}{2!} + 6\frac{z^3}{3!} + 60\frac{z^4}{4!} + 520\frac{z^5}{5!} + 6660\frac{z^6}{6!} + 93408\frac{z^7}{7!} + \cdots.$$

The specification (27) has a hierarchical structure, as suggested by the top representation of
Figure IV.7, and this structure is itself directly reflected by the form of the expression tree of the
GF $Tr(z)$. Then, each node in the expression tree of $Tr(z)$ can be tagged with the corresponding
value of the radius of convergence. This is done according to the principles of Theorem IV.8;
see the right diagram of Figure IV.7. For instance, the quantity 0.68245 associated to $Wa(z)$ is
given by the sequence rule and is determined as the smallest positive solution of the equation


$$z^2\left(1 - \log(1 - z)\right) = 1.$$


The tagging process works upwards till the root of the tree is reached; here the radius of
convergence of $Tr$ is determined to be $\rho \doteq 0.48512\cdots$, a quantity that happens to coincide with
the ratio $[z^{49}]Tr(z)/[z^{50}]Tr(z)$ to more than 15 decimal places. ■
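
The two radii quoted in this example can be recovered numerically. The sketch below is an illustrative addition that applies the sequence rule of Theorem IV.8 twice. As an assumption made for brevity, it evaluates the closed-form EGFs with floating-point logarithms instead of truncated Taylor series, and the function names are hypothetical.

```python
import math


def bisect_root(func, lo, hi, tol=1e-12):
    """Root of an increasing function with func(lo) < 0 < func(hi), by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def plank(z):
    """EGF of planks: z^2 * (1 + log(1/(1-z)))."""
    return z * z * (1.0 - math.log(1.0 - z))


def wagon(z):
    """EGF of wagons, SEQ_{>=1}(planks); valid for z below the wagon radius."""
    p = plank(z)
    return p / (1.0 - p)


def passenger(z):
    """EGF of passengers: (log(1/(1-z)))^2."""
    return math.log(1.0 / (1.0 - z)) ** 2


# Sequence rule for wagons: smallest positive root of plank(z) = 1.
rho_wagon = bisect_root(lambda z: plank(z) - 1.0, 0.0, 0.999)

# Sequence rule for trains: root of wagon(z) * exp(passenger(z)) = 1,
# searched strictly inside (0, rho_wagon), where the wagon EGF is finite.
rho_train = bisect_root(
    lambda z: wagon(z) * math.exp(passenger(z)) - 1.0, 0.0, rho_wagon - 1e-9
)

if __name__ == "__main__":
    print(rho_wagon, rho_train)
```

The final radius is the minimum over the two factors of the product defining trains; since the root found for the inner sequence lies below the wagon radius, it is the one that prevails.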


IV.5. Rational and meromorphic functions 


The last section has fully justified the First Principle of Coefficient Asymptotics
leading to the exponential growth formula $f_n \bowtie A^n$ for the coefficients of an analytic
function $f(z)$. Indeed, as we saw, one has $A = 1/\rho$, where $\rho$ equals both the radius of
convergence of the series representing $f$ and the distance of the origin to the dominant,
i.e., closest, singularities. We are going to start examining here the Second Principle,
already given on p. 227 and relative to the form

$$f_n = A^n \theta(n),$$

with $\theta(n)$ the subexponential factor:


Second Principle of Coefficient Asymptotics. The nature of a function’s
singularities determines the associate subexponential factor ($\theta(n)$).


In this section, we develop a complete theory in the case of rational functions (that is,
quotients of polynomials) and, more generally, meromorphic functions. The net result
is that, for such functions, the subexponential factors are essentially polynomials:


Polar singularities ⇝ subexponential factors $\theta(n)$ of polynomial growth.


A distinguishing feature is the extremely good quality of the asymptotic approximations
obtained; for naturally occurring combinatorial problems, 15 digits of accuracy
is not uncommon in coefficients of index as low as 50 (see Figure IV.8, p. 260 below
for a striking example).


IV.5.1. Rational functions. A function $f(z)$ is a rational function iff it is of the
form $f(z) = N(z)/D(z)$, with $N(z)$ and $D(z)$ being polynomials, which we may,
without loss of generality, assume to be relatively prime. For rational functions that
are analytic at the origin (e.g., generating functions), we have $D(0) \neq 0$.

Sequences $\{f_n\}_{n\geq 0}$ that are coefficients of rational functions satisfy linear recurrence
relations with constant coefficients. This fact is easy to establish: compute
$[z^n]\, f(z) \cdot D(z)$; then, with $D(z) = d_0 + d_1 z + \cdots + d_m z^m$, one has, for all
$n > \deg(N(z))$,


$$\sum_{j=0}^{m} d_j f_{n-j} = 0.$$
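
The recurrence gives an immediate way of computing coefficients. Here is a minimal sketch (an addition, with a hypothetical function name) that extracts the first Taylor coefficients of $N(z)/D(z)$ from the identity $[z^n]\,f(z) D(z) = [z^n] N(z)$, which reduces to the homogeneous recurrence above once $n > \deg(N(z))$. Polynomials are assumed to be given as coefficient lists, lowest degree first, and exact rational arithmetic is used.

```python
from fractions import Fraction


def rational_coefficients(numerator, denominator, count):
    """First `count` Taylor coefficients of N(z)/D(z), assuming D(0) != 0.

    `numerator` and `denominator` list polynomial coefficients, lowest degree first.
    """
    d0 = Fraction(denominator[0])
    if d0 == 0:
        raise ValueError("D(0) must be non-zero for analyticity at the origin")
    coeffs = []
    for n in range(count):
        # [z^n] N(z), which vanishes beyond the degree of N.
        total = Fraction(numerator[n]) if n < len(numerator) else Fraction(0)
        # Move the terms d_j * f_{n-j}, j >= 1, to the right-hand side.
        for j in range(1, min(n, len(denominator) - 1) + 1):
            total -= denominator[j] * coeffs[n - j]
        coeffs.append(total / d0)
    return coeffs


if __name__ == "__main__":
    # z / (1 - z - z^2): the Fibonacci numbers.
    print([int(c) for c in rational_coefficients([0, 1], [1, -1, -1], 10)])
```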


The main theorem we prove now provides an exact finite expression for coeffi- 
cients of f(z) in terms of the poles of f(z). Individual terms in these expressions are 
sometimes called exponential—polynomials. 
Theorem IV.9 (Expansion of rational functions). If $f(z)$ is a rational function that is
analytic at zero and has poles at points $\alpha_1, \alpha_2, \ldots, \alpha_m$, then its coefficients are a sum
of exponential–polynomials: there exist $m$ polynomials $\{\Pi_j(x)\}_{j=1}^{m}$ such that, for $n$
larger than some fixed $n_0$,


$$f_n \equiv [z^n] f(z) = \sum_{j=1}^{m} \Pi_j(n)\, \alpha_j^{-n}. \tag{28}$$

Furthermore the degree of $\Pi_j$ is equal to the order of the pole of $f$ at $\alpha_j$ minus one.


Proof. Since $f(z)$ is rational it admits a partial fraction expansion. To wit:


$$f(z) = Q(z) + \sum_{(\alpha, r)} \frac{c_{\alpha,r}}{(z-\alpha)^r},$$

where $Q(z)$ is a polynomial of degree $n_0 := \deg(N) - \deg(D)$ if $f = N/D$. Here $\alpha$
ranges over the poles of $f(z)$ and $r$ is bounded from above by the multiplicity of $\alpha$ as
a pole of $f$. Coefficient extraction in this expression results from Newton’s expansion,


$$[z^n]\frac{1}{(z-\alpha)^r} = \frac{(-1)^r}{\alpha^r}\,[z^n]\frac{1}{\left(1 - \frac{z}{\alpha}\right)^r} = \frac{(-1)^r}{\alpha^r}\binom{n+r-1}{r-1}\alpha^{-n}.$$


The binomial coefficient is a polynomial of degree $r - 1$ in $n$, and collecting terms
associated with a given $\alpha$ yields the statement of the theorem. ■


Notice that the expansion (28) is also an asymptotic expansion in disguise: when
grouping terms according to the $\alpha$’s of increasing modulus, each group appears to be
exponentially smaller than the previous one. In particular, if there is a unique dominant
pole, $|\alpha_1| < |\alpha_2| \leq |\alpha_3| \leq \cdots$, then


$$f_n \sim \alpha_1^{-n}\,\Pi_1(n),$$


and the error term is exponentially small as it is $O(\alpha_2^{-n} n^r)$ for some $r$. A classical
instance is the OGF of Fibonacci numbers,


$$F(z) = \frac{z}{1 - z - z^2},$$

with poles at $\frac{-1+\sqrt{5}}{2} \doteq 0.61803$ and $\frac{-1-\sqrt{5}}{2} \doteq -1.61803$, so that

$$[z^n]F(z) \equiv F_n = \frac{1}{\sqrt{5}}\varphi^n - \frac{1}{\sqrt{5}}\bar{\varphi}^n = \frac{\varphi^n}{\sqrt{5}} + O\left(\frac{1}{\varphi^n}\right),$$


with $\varphi = (1 + \sqrt{5})/2$ the golden ratio, and $\bar{\varphi}$ its conjugate.
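
The exponential smallness of the error term is easy to observe. The following minimal sketch (an addition, with illustrative names) compares exact Fibonacci numbers with the dominant-pole estimate $\varphi^n/\sqrt{5}$ in double-precision arithmetic; the discrepancy is $|\bar{\varphi}|^n/\sqrt{5}$, so that rounding the estimate to the nearest integer already recovers $F_n$ for moderate $n$.

```python
import math

PHI = (1.0 + math.sqrt(5.0)) / 2.0  # the golden ratio


def fibonacci(n):
    """Exact Fibonacci number F_n with F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def dominant_pole_estimate(n):
    """Contribution of the dominant pole: phi^n / sqrt(5)."""
    return PHI ** n / math.sqrt(5.0)


if __name__ == "__main__":
    for n in (5, 10, 20, 30):
        print(n, fibonacci(n), dominant_pole_estimate(n))
```

The range of validity of the rounding trick is limited here only by floating-point precision, not by the asymptotics.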


▷ IV.26. A simple exercise. Let $f(z)$ be as in Theorem IV.9, assuming additionally a single
dominant pole $\alpha_1$, with multiplicity $r$. Then, by inspection of the proof of Theorem IV.9:


$$f_n = \frac{(-1)^r\, C}{(r-1)!}\,\alpha_1^{-n-r}\, n^{r-1}\left(1 + O\left(\frac{1}{n}\right)\right) \quad\text{with}\quad C = \lim_{z\to\alpha_1} (z - \alpha_1)^r f(z).$$

This is certainly the most direct illustration of the Second Principle: under the assumptions, a
one-term asymptotic expansion of the function at its dominant singularity suffices to determine
the asymptotic form of the coefficients. ◁
Example IV.5. Qualitative analysis of a rational function. This is an artificial example
designed to demonstrate that all the details of the full decomposition are usually not required. The
rational function

$$f(z) = \frac{1}{(1-z^3)^2\,(1-z^2)^3\,\left(1 - \frac{z^2}{2}\right)}$$

has a pole of order 5 at $z = 1$, poles of order 2 at $z = \omega, \omega^2$ ($\omega = e^{2i\pi/3}$ a cubic root of unity),
a pole of order 3 at $z = -1$, and simple poles at $z = \pm\sqrt{2}$. Therefore,


$$f_n = P_1(n) + P_2(n)\,\omega^{-n} + P_3(n)\,\omega^{-2n} + P_4(n)(-1)^n + P_5(n)\,2^{-n/2} + P_6(n)(-1)^n 2^{-n/2},$$

where the degrees of $P_1, \ldots, P_6$ are 4, 1, 1, 2, 0, 0. For an asymptotic equivalent of $f_n$, only
the poles at roots of unity need to be considered since they correspond to the fastest exponential
growth; in addition, only $z = 1$ needs to be considered for first-order asymptotics; finally, at
$z = 1$, only the term of fastest growth needs to be taken into account. In this way, we find the
correspondence


$$f(z) \sim \frac{1}{3^2 \cdot 2^3 \cdot \left(\frac{1}{2}\right)}\,\frac{1}{(1-z)^5} \quad\Longrightarrow\quad f_n \sim \frac{1}{3^2 \cdot 2^3 \cdot \left(\frac{1}{2}\right)}\binom{n+4}{4} \sim \frac{n^4}{864}.$$

The way the analysis can be developed without computing details of partial fraction expansion
is typical. ■


Theorem IV.9 applies to any specification leading to a GF that is a rational function⁸.
Combined with the qualitative approach to rational coefficient asymptotics, it
gives access to a large number of effective asymptotic estimates for combinatorial
counting sequences.


Example IV.6. Asymptotics of denumerants. Denumerants are integer partitions with summands
restricted to be from a fixed finite set (Chapter I, p. 43). We let $\mathcal{P}^T$ be the class relative
to set $T \subset \mathbb{Z}_{>0}$, with the known OGF,


$$P^T(z) = \prod_{\omega \in T} \frac{1}{1 - z^\omega}.$$

Without loss of generality, we assume that $\gcd(T) = 1$; that is, the coin denominations are not
all multiples of a number $d > 1$.


A particular case is the one of integer partitions whose summands are in $\{1, 2, \ldots, r\}$,

$$P^{\{1,\ldots,r\}}(z) = \prod_{m=1}^{r} \frac{1}{1 - z^m}.$$


The GF has all its poles being roots of unity. At $z = 1$, the order of the pole is $r$, and one has


$$P^{\{1,\ldots,r\}}(z) \sim \frac{1}{r!}\,\frac{1}{(1-z)^r},$$

as $z \to 1$. Other poles have strictly smaller multiplicity. For instance the multiplicity of $z = -1$
is equal to the number of factors $(1 - z^{2j})^{-1}$ in $P^{\{1,\ldots,r\}}$, which is the same as the number of
coin denominations that are even; this last number is at most $r - 1$ since, by the gcd assumption
$\gcd(T) = 1$, at least one is odd. Similarly, a primitive $q$th root of unity is found to have
multiplicity at most $r - 1$. It follows that the pole $z = 1$ contributes a term of the form $n^{r-1}$
to the coefficient of index $n$, while each of the other poles contributes a term of order at most
$n^{r-2}$. We thus find


$$P_n^{\{1,\ldots,r\}} \sim c_r\, n^{r-1} \quad\text{with}\quad c_r = \frac{1}{r!\,(r-1)!}.$$


The same argument provides the asymptotic form of $P_n^T$, since, to first order asymptotics,
only the pole at $z = 1$ counts.


⁸ In Part A, we have been occasionally led to discuss coefficients of some simple enough rational
functions, thereby anticipating the statement of the theorem: see for instance the discussion of parts in
compositions (p. 168) and of records in sequences (p. 190).

Proposition IV.2. Let $T$ be a finite set of integers without a common divisor ($\gcd(T) = 1$).
The number of partitions with summands restricted to $T$ satisfies


$$P_n^T \sim \frac{1}{\tau}\,\frac{n^{r-1}}{(r-1)!}, \quad\text{with}\quad \tau := \prod_{\omega \in T} \omega, \quad r := \mathrm{card}(T).$$

For instance, in a strange country that would have pennies (1 cent), nickels (5 cents), dimes
(10 cents), and quarters (25 cents), the number of ways to make change for a total of $n$ cents is


$$[z^n]\frac{1}{(1-z)(1-z^5)(1-z^{10})(1-z^{25})} \sim \frac{1}{1 \cdot 5 \cdot 10 \cdot 25}\,\frac{n^3}{3!} \equiv \frac{n^3}{7500},$$

asymptotically. ■
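
Proposition IV.2 lends itself to a direct check. The sketch below is an illustrative addition with hypothetical function names: it counts denumerants exactly by multiplying in the factors $1/(1 - z^\omega)$ one at a time, and compares the count with the leading term $n^{r-1}/(\tau\,(r-1)!)$. Since the neglected terms are of relative order $1/n$, the agreement is only rough for small totals and improves slowly as $n$ grows.

```python
import math


def count_change(total, coins):
    """Number of partitions of `total` into summands taken from `coins` (exact)."""
    ways = [1] + [0] * total
    for coin in coins:
        # In-place multiplication of the truncated OGF by 1/(1 - z^coin).
        for amount in range(coin, total + 1):
            ways[amount] += ways[amount - coin]
    return ways[total]


def leading_estimate(total, coins):
    """Leading term n^(r-1) / (tau * (r-1)!) of Proposition IV.2."""
    tau = math.prod(coins)
    r = len(coins)
    return total ** (r - 1) / (tau * math.factorial(r - 1))


if __name__ == "__main__":
    coins = [1, 5, 10, 25]
    for n in (100, 1000, 10000):
        exact = count_change(n, coins)
        print(n, exact, exact / leading_estimate(n, coins))
```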


IV.5.2. Meromorphic functions. An expansion similar to that of Theorem IV.9 
(p. 256) holds true for coefficients of a much larger class; namely, meromorphic func- 
tions. 


Theorem IV.10 (Expansion of meromorphic functions). Let $f(z)$ be a function meromorphic
at all points of the closed disc $|z| \leq R$, with poles at points $\alpha_1, \alpha_2, \ldots, \alpha_m$.
Assume that $f(z)$ is analytic at all points of $|z| = R$ and at $z = 0$. Then there exist $m$
polynomials $\{\Pi_j(x)\}_{j=1}^{m}$ such that:


$$f_n \equiv [z^n] f(z) = \sum_{j=1}^{m} \Pi_j(n)\,\alpha_j^{-n} + O(R^{-n}). \tag{29}$$


Furthermore the degree of $\Pi_j$ is equal to the order of the pole of $f$ at $\alpha_j$ minus one.


Proof. We offer two different proofs, one based on subtracted singularities, the other 
one based on contour integration. 


(i) Subtracted singularities. Around any pole $\alpha$, $f(z)$ can be expanded locally:


$$\begin{aligned}
f(z) &= \sum_{k \geq -M} c_{\alpha,k}\,(z-\alpha)^k && (30) \\
     &= S_\alpha(z) + H_\alpha(z) && (31)
\end{aligned}$$


where the “singular part” $S_\alpha(z)$ is obtained by collecting all the terms with index in
$[-M\,..\,-1]$ (that is, forming $S_\alpha(z) = N_\alpha(z)/(z - \alpha)^M$ with $N_\alpha(z)$ a polynomial
of degree less than $M$) and $H_\alpha(z)$ is analytic at $\alpha$. Thus setting $S(z) := \sum_j S_{\alpha_j}(z)$,
we observe that $f(z) - S(z)$ is analytic for $|z| \leq R$. In other words, by collecting
the singular parts of the expansions and subtracting them, we have “removed” the singularities
of $f(z)$, whence the name of method of subtracted singularities sometimes
given to the method [329, vol. 2, p. 448].
Taking coefficients, we get:


$$[z^n] f(z) = [z^n] S(z) + [z^n]\left(f(z) - S(z)\right).$$


The coefficient of $z^n$ in the rational function $S(z)$ is obtained from Theorem IV.9.
It suffices to prove that the coefficient of $z^n$ in $f(z) - S(z)$, a function analytic for
$|z| \leq R$, is $O(R^{-n})$. This fact follows from trivial bounds applied to Cauchy’s integral
formula with the contour of integration being $\gamma = \{z : |z| = R\}$, as in the proof of
Proposition IV.1, p. 246 (saddle-point bounds):


$$\left|[z^n]\left(f(z) - S(z)\right)\right| = \frac{1}{2\pi}\left|\int_\gamma \left(f(z) - S(z)\right)\frac{dz}{z^{n+1}}\right| \leq \frac{1}{2\pi}\,\frac{O(1)}{R^{n+1}}\,2\pi R.$$

(ii) Contour integration. There is another line of proof for Theorem IV.10 which
we briefly sketch as it provides an insight which is useful for applications to other
types of singularities treated in Chapter VI. It consists in using Cauchy’s coefficient
formula and “pushing” the contour of integration past singularities. In other words,
one computes directly the integral


$$I_n = \frac{1}{2i\pi}\int_{|z|=R} f(z)\,\frac{dz}{z^{n+1}}$$

by residues. There is a pole at $z = 0$ with residue $f_n$ and poles at the $\alpha_j$ with residues
corresponding to the terms in the expansion stated in Theorem IV.10; for instance, if
$f(z) \sim c/(z - a)$ as $z \to a$, then


$$\mathrm{Res}\left(f(z)\,z^{-n-1};\ z = a\right) = \mathrm{Res}\left(\frac{c}{z-a}\,z^{-n-1};\ z = a\right) = \frac{c}{a^{n+1}}.$$


Finally, by the same trivial bounds as before, $I_n$ is $O(R^{-n})$. ■
▷ IV.27. Effective error bounds. The error term $O(R^{-n})$ in (29), call it $\varepsilon_n$, satisfies
$$|\varepsilon_n| \leq R^{-n} \cdot \sup_{|z|=R} |f(z)|.$$

This results immediately from the second proof. This bound may be useful, even in the case of
rational functions to which it is clearly applicable. ◁

As a consequence of Theorem IV.10, all GFs whose dominant singularities are
poles can be easily analysed. Prime candidates from Part A are specifications that
are “driven” by a sequence construction, since the translation of sequences involves a
quasi-inverse, itself conducive to polar singularities. This covers in particular surjections,
alignments, derangements, and constrained compositions, which we treat now.


Example IV.7. Surjections. These are defined as sequences of sets ($\mathcal{R} = \mathrm{SEQ}(\mathrm{SET}_{\geq 1}(\mathcal{Z}))$)
with EGF $R(z) = (2 - e^z)^{-1}$ (see p. 106). We have already determined the poles in Example IV.2
(p. 244), the one of smallest modulus being at $\log 2 \doteq 0.69314$. At this dominant
pole, one finds $R(z) \sim -\frac{1}{2}(z - \log 2)^{-1}$. This implies an approximation for the number of
surjections:


$$R_n \equiv n!\,[z^n]R(z) \sim \xi(n), \quad\text{with}\quad \xi(n) := \frac{n!}{2}\cdot\left(\frac{1}{\log 2}\right)^{n+1}.$$
     n    R_n

     2    3
     4    75
     6    4683
     8    545835
    10    102247563
    12    28091567595
    14    10641342970443
    16    5315654681981355
    18    3385534663256845323
    20    2677687796244384203115
    22    2574844419803190384544203
    24    2958279121074145472650648875
    26    4002225759844168492486127539083
    28    6297562064950066033518373935334635
    30    11403568794011880483742464196184901963
    32    23545154085734896649184490637144855476395

Figure IV.8. The surjection numbers pyramid: for $n = 2, 4, \ldots, 32$, the exact values
of the numbers $R_n$ (left) compared to the approximation $\lceil \xi(n) \rfloor$ with discrepant digits
in boldface (right). [Only the left column is reproduced here.]


Figure IV.8 gives, for n = 2, 4, ..., 32, a table of the values of the surjection numbers (left)
compared with the asymptotic approximation rounded to the nearest integer, $\lceil \xi(n) \rfloor$. It is
piquant to see that $\lceil \xi(n) \rfloor$ provides the exact value of $R_n$ for all values of n = 1, ..., 15, and
it starts losing one digit for n = 17, after which point a few “wrong” digits gradually appear,
but in very limited number; see Figure IV.8. (A similar situation prevails for tangent numbers
discussed in our Invitation, p. 5.) The explanation of such a faithful asymptotic representation
owes to the fact that the error terms provided by meromorphic asymptotics are exponentially
small. In effect, there is no other pole in $|z| \le 6$, the next ones being at $\log 2 \pm 2i\pi$ with
modulus of about 6.32. Thus, for $r_n = [z^n]R(z)$, there holds

(32)    $r_n = \dfrac{R_n}{n!} = \dfrac12\left(\dfrac{1}{\log 2}\right)^{n+1} + O(6^{-n})$.

For the double surjection problem, $R^*(z) = (2 + z - e^z)^{-1}$, we get similarly

    $[z^n]R^*(z) \sim \dfrac{1}{e^{\rho^*} - 1}\,(\rho^*)^{-n-1}$,

with $\rho^* \doteq 1.14619$ the smallest positive root of $e^{\rho^*} - \rho^* = 2$. □
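The quality of the approximation $\xi(n)$ can be checked numerically. The following Python sketch is only an illustration: it computes $R_n$ exactly from the recurrence $R_n = \sum_{k=1}^{n}\binom{n}{k}R_{n-k}$ (choose the first block of the sequence), which is an assumption of this sketch rather than a formula of the text, and it compares $R_n$ with $\xi(n)$ rounded to the nearest integer. The function names are hypothetical.

```python
from decimal import Decimal, getcontext
from math import comb, factorial

getcontext().prec = 80  # working precision, ample for the values of n used here


def surjection_numbers(n_max):
    """Exact R_0..R_n_max via R_n = sum_{k=1..n} C(n,k) R_{n-k}, with R_0 = 1."""
    r = [1]
    for n in range(1, n_max + 1):
        r.append(sum(comb(n, k) * r[n - k] for k in range(1, n + 1)))
    return r


def xi(n):
    """Main asymptotic term xi(n) = n! / (2 (log 2)^(n+1)), as a Decimal."""
    return Decimal(factorial(n)) / (2 * Decimal(2).ln() ** (n + 1))


def nearest_int(x):
    """Round a positive Decimal to the nearest integer, i.e. floor(x + 1/2)."""
    return int(x + Decimal(1) / 2)


if __name__ == "__main__":
    R = surjection_numbers(20)
    for n in range(1, 21):
        approx = nearest_int(xi(n))
        print(n, R[n], approx, "exact" if approx == R[n] else "differs")
```

Exact integer arithmetic is used for $R_n$ and high-precision decimals for $\xi(n)$, so that any discrepancy comes from the neglected poles and not from rounding in the computation.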


It is worth reflecting on this example as it is representative of a “production chain” 
based on the two successive implications which are characteristic of Part A and Part B 
of the book: 


    $\mathcal R = \mathrm{SEQ}(\mathrm{SET}_{\ge 1}(\mathcal Z)) \;\Longrightarrow\; R(z) = \dfrac{1}{2 - e^z}$

    $R(z) \underset{z \to \log 2}{\sim} -\dfrac12\,\dfrac{1}{z - \log 2} \;\longrightarrow\; \dfrac{1}{n!}R_n \sim \dfrac12 (\log 2)^{-n-1}$.

The first implication (written “$\Longrightarrow$”, as usual) is provided automatically by the symbolic
method. The second one (written here “$\longrightarrow$”) is a direct translation from the
expansion of the GF at its dominant singularity to the asymptotic form of coefficients; it
is valid conditionally upon complex analytic conditions, here those of Theorem IV.10.

Footnote. The notation $\lceil x \rfloor$ represents x rounded to the nearest integer: $\lceil x \rfloor := \lfloor x + \frac12 \rfloor$.


Example IV.8. Alignments. These are sequences of cycles ($\mathcal O = \mathrm{SEQ}(\mathrm{CYC}(\mathcal Z))$, p. 119) with
EGF

    $O(z) = \dfrac{1}{1 - \log\frac{1}{1 - z}}$.

There is a singularity when $\log(1 - z)^{-1} = 1$, which is at $\rho = 1 - e^{-1}$ and which arises before
z = 1, where the logarithm becomes singular. Then, the computation of the asymptotic form of
$[z^n]O(z)$ only requires a local expansion near $\rho$,

    $O(z) \sim \dfrac{-e^{-1}}{z - 1 + e^{-1}} \;\longrightarrow\; [z^n]O(z) \sim \dfrac{e^{-1}}{(1 - e^{-1})^{n+1}}$,

and the coefficient estimates result from Theorem IV.10. □


> IV.28. Some “supernecklaces”. One estimates

    $[z^n]\log\left(\dfrac{1}{1 - \log\frac{1}{1 - z}}\right) \sim \dfrac{1}{n}\,(1 - e^{-1})^{-n}$,

where the EGF enumerates labelled cycles of cycles (supernecklaces, p. 125). [Hint: Take
derivatives.] <


Example IV.9. Generalized derangements. The probability that the shortest cycle in a random
permutation of size n has length larger than k is

    $[z^n]D^{(k)}(z)$, where $D^{(k)}(z) = \dfrac{1}{1 - z}\exp\left(-\dfrac{z}{1} - \dfrac{z^2}{2} - \cdots - \dfrac{z^k}{k}\right)$,

as results from the specification $\mathcal D^{(k)} = \mathrm{SET}(\mathrm{CYC}_{>k}(\mathcal Z))$. For any fixed k, one has (easily)
$D^{(k)}(z) \sim e^{-H_k}/(1 - z)$ as $z \to 1$, with 1 being a simple pole. Accordingly the coefficients
$[z^n]D^{(k)}(z)$ tend to $e^{-H_k}$ as $n \to \infty$. In summary, due to meromorphy, we have the
characteristic implication

    $D^{(k)}(z) \sim \dfrac{e^{-H_k}}{1 - z} \;\longrightarrow\; [z^n]D^{(k)}(z) \sim e^{-H_k}$.

Since there is no other singularity at a finite distance, the error in the approximation is (at least)
exponentially small,

(33)    $[z^n]\,\dfrac{1}{1 - z}\exp\left(-\dfrac{z}{1} - \dfrac{z^2}{2} - \cdots - \dfrac{z^k}{k}\right) = e^{-H_k} + O(R^{-n})$,

for any R > 1. The cases k = 1, 2 in particular justify the estimates mentioned at the beginning
of this chapter on p. 228. □
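The estimate (33) lends itself to a direct numerical check. The sketch below is illustrative only: it extracts the coefficients of $\exp(-z - \cdots - z^k/k)$ by the standard recurrence that follows from $E' = g'E$, then takes partial sums to account for the factor $1/(1-z)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import exp


def shortest_cycle_tail(n, k):
    """Exact value of [z^n] exp(-z - z^2/2 - ... - z^k/k) / (1 - z)."""
    # e[m] = [z^m] exp(g(z)) with g(z) = -(z + z^2/2 + ... + z^k/k).
    # Since E' = g' E and g'(z) = -(1 + z + ... + z^(k-1)), one has
    # m * e[m] = -(e[m-1] + e[m-2] + ... + e[m - min(k, m)]).
    e = [Fraction(1)]
    for m in range(1, n + 1):
        e.append(-sum(e[m - j] for j in range(1, min(k, m) + 1)) / m)
    # Division by (1 - z) turns coefficients into partial sums.
    return sum(e)


def limit_value(k):
    """The limit e^{-H_k}, where H_k is the k-th harmonic number."""
    return exp(-sum(1.0 / j for j in range(1, k + 1)))


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, float(shortest_cycle_tail(30, k)), limit_value(k))
```

For k = 1 the function returns the classical derangement probabilities, for instance 1/3 when n = 3.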


This example is also worth reflecting upon. In prohibiting cycles of length $\le k$,
we modify the EGF of all permutations, $(1 - z)^{-1}$, by a factor $e^{-z/1 - \cdots - z^k/k}$. The
resulting EGF is meromorphic at 1; thus only the value of the modifying factor at
z = 1 matters, so that this value, namely $e^{-H_k}$, provides the asymptotic proportion
of k-derangements. We shall encounter more and more shortcuts of this sort as we
progress into the book.

> IV.29. Shortest cycles of permutations are not too long. Let $S_n$ be the random variable
denoting the length of the shortest cycle in a random permutation of size n. Using the circle
|z| = 2 to estimate the error in the approximation $e^{-H_k}$ above, one finds that, for $k \le \log n$,

    $\left|\mathbb P(S_n > k) - e^{-H_k}\right| \le \dfrac{1}{2^n}\, e^{2^{k+1}}$,

which is exponentially small in this range of k-values. Thus, the approximation $e^{-H_k}$ remains
usable when k is allowed to tend sufficiently slowly to $\infty$ with n. One can also explore the
possibility of better bounds and larger regions of validity of the main approximation. (See
Panario and Richmond’s study [470] for a general theory of smallest components in sets.) <


> IV.30. Expected length of the shortest cycle. The classical approximation of the harmonic
numbers, $H_k \approx \log k + \gamma$, suggests $e^{-\gamma}/k$ as a possible approximation to (33) for both large n
and large k in suitable regions. In agreement with this heuristic argument, the expected length
of the shortest cycle in a random permutation of size n is effectively asymptotic to

    $\sum_{k=1}^{n} \dfrac{e^{-\gamma}}{k} \sim e^{-\gamma}\log n$,

a property first discovered by Shepp and Lloyd [540]. <


The next example illustrates the analysis of a collection of rational generating 
functions (Smirnov words) paralleling nicely the enumeration of a special type of 
integer composition (Carlitz compositions), which belongs to meromorphic asymp- 
totics. 


Example IV.10. Smirnov words and Carlitz compositions. Bernoulli trials have been discussed
in Chapter III (p. 204), in relation to weighted word models. Take the class $\mathcal W$ of all words over
an r-ary alphabet, where letter j is assigned probability $p_j$ and letters of words are drawn
independently. With this weighting, the GF of all words is $W(z) = 1/(1 - \sum p_j z) = (1 - z)^{-1}$.
Consider the problem of determining the probability that a random word of length n is of
Smirnov type, that is, all blocks of length 2 are formed with unequal letters. In order to avoid
degeneracies, we impose $r \ge 3$ (since for r = 2, the only Smirnov words are ababa... and
babab...).

By our discussion in Example III.24 (p. 204), the GF of Smirnov words (again with the
probabilistic weighting) is

    $S(z) = \dfrac{1}{1 - \sum_{j=1}^{r} \dfrac{p_j z}{1 + p_j z}}$.

By monotonicity of the denominator, this rational function has a dominant singularity at the
unique positive solution of the equation

(34)    $\sum_{j=1}^{r} \dfrac{p_j \rho}{1 + p_j \rho} = 1$,

and the point $\rho$ is a simple pole. Consequently, $\rho$ is a well-characterized algebraic number
defined implicitly by a polynomial equation of degree $\le r$. One can furthermore check, by
studying the variations of the denominator, that the other roots are all real and negative; thus,
$\rho$ is the unique dominant singularity. (Alternatively, appeal to the Perron–Frobenius argument
of Example V.11, p. 349.) It follows that the probability for a word to be Smirnov is, not too
surprisingly, exponentially small, the precise formula being

(35)    $[z^n]S(z) \sim C \cdot \rho^{-n}$,   $C = \left(\rho \sum_{j=1}^{r} \dfrac{p_j}{(1 + p_j \rho)^2}\right)^{-1}$.

A similar analysis, using bivariate generating functions, shows that in a random word of length n
conditioned to be Smirnov, the letter j appears with asymptotic frequency

    $q_j = \dfrac{1}{Q}\,\dfrac{p_j}{(1 + p_j \rho)^2}$,   $Q := \sum_{j=1}^{r} \dfrac{p_j}{(1 + p_j \rho)^2}$,

in the sense that the mean number of occurrences of letter j is asymptotic to $q_j n$. All these
results are seen to be consistent with the equiprobable letter case $p_j = 1/r$, for which
$\rho = r/(r - 1)$.
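As an illustration of how (34) and the accompanying constant are used in practice, here is a small Python sketch. It is a hedged example: the root is located by plain bisection (any root finder would do), the constant is computed as $C = (\rho\sum_j p_j/(1+p_j\rho)^2)^{-1}$, and the exact probability is obtained from an elementary recurrence on the last letter, which is an independent check introduced for this sketch. All function names are hypothetical.

```python
def smirnov_rho(p, iterations=200):
    """Positive root of sum_j p_j*x/(1 + p_j*x) = 1 (equation (34)), by bisection."""
    def g(x):
        return sum(pj * x / (1 + pj * x) for pj in p) - 1

    lo, hi = 0.0, 1.0
    while g(hi) < 0:          # g increases from -1 towards r - 1 > 0: a sign change exists
        hi *= 2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def smirnov_constant(p, rho):
    """Constant C = 1 / (rho * sum_j p_j / (1 + p_j*rho)^2)."""
    return 1 / (rho * sum(pj / (1 + pj * rho) ** 2 for pj in p))


def smirnov_probability(p, n):
    """Exact probability that n independent letters form a Smirnov word (n >= 1)."""
    end = list(p)             # end[j]: the word is Smirnov so far and ends with letter j
    for _ in range(n - 1):
        total = sum(end)
        end = [pj * (total - ej) for pj, ej in zip(p, end)]
    return sum(end)


if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    rho = smirnov_rho(p)
    c = smirnov_constant(p, rho)
    for n in (10, 20, 40):
        print(n, smirnov_probability(p, n), c * rho ** (-n))
```

In the equiprobable case with r = 3 the sketch returns $\rho = 3/2$ and $C = 3/2$, in agreement with $\rho = r/(r-1)$.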

Carlitz compositions illustrate a limit situation, in which the alphabet is infinite, while
letters have different sizes. Recall that a Carlitz composition of the integer n is a composition
of n such that no two adjacent summands have equal value. By Note III.32, p. 201, such
compositions can be obtained by substitution from Smirnov words, to the effect that

(36)    $K(z) = \left(1 - \sum_{j=1}^{\infty} \dfrac{z^j}{1 + z^j}\right)^{-1}$.

The asymptotic form of the coefficients then results from an analysis of dominant poles. The
OGF has a simple pole at $\rho$, which is the smallest positive root of the equation

(37)    $\sum_{j=1}^{\infty} \dfrac{\rho^j}{1 + \rho^j} = 1$.

(Note the analogy with (34) due to commonality of the combinatorial argument.) Thus:

    $K_n \sim C \cdot \beta^n$,   $C \doteq 0.45636\,34740$,   $\beta \doteq 1.75024\,12917$.

There, $\beta = 1/\rho$ with $\rho$ as in (37). In a way analogous to Smirnov words, the asymptotic
frequency of summand k appears to be proportional to $k\rho^k/(1 + \rho^k)^2$; see [369, 421] for further
properties. □
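The growth rate $\beta$ can be recovered numerically in two independent ways, as in the following sketch (an illustration, not the authors’ computation). The counts $K_n$ come from a simple recurrence on the last summand, and $\rho$ is obtained by bisection on a truncation of the series in (37); the truncation length and the bracketing interval are assumptions of the sketch.

```python
def carlitz_counts(n_max):
    """K_0..K_n_max: compositions of n in which no two adjacent summands are equal."""
    # last[n][j] = number of Carlitz compositions of n whose last summand is j.
    last = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    total = [0] * (n_max + 1)
    total[0] = 1                                  # the empty composition
    for n in range(1, n_max + 1):
        for j in range(1, n + 1):
            # Append j to a Carlitz composition of n - j that does not end with j.
            last[n][j] = total[n - j] - last[n - j][j]
        total[n] = sum(last[n])
    return total


def carlitz_rho(terms=400, iterations=200):
    """Root of sum_{j>=1} x^j/(1 + x^j) = 1 in (0.1, 0.9), series truncated after `terms` terms."""
    def h(x):
        return sum(x ** j / (1 + x ** j) for j in range(1, terms + 1)) - 1

    lo, hi = 0.1, 0.9                             # h(lo) < 0 < h(hi), and h is increasing
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if h(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    K = carlitz_counts(60)
    beta = 1 / carlitz_rho()
    print(K[:9], beta, K[60] / K[59], K[60] / beta ** 60)
```

The last two printed quantities give empirical estimates of $\beta$ and $C$ that can be compared with the values quoted above.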


IV. 6. Localization of singularities 


There are situations where a function possesses several dominant singularities, 
that is, several singularities are present on the boundary of the disc of convergence. 
We examine here the induced effect on coefficients and discuss ways to locate such 
dominant singularities. 


IV.6.1. Multiple singularities. In the case when there exists more than one
dominant singularity, several geometric terms of the form $\zeta^n$ sharing the same
modulus (and each carrying its own subexponential factor) must be combined. In simpler
situations, such terms globally induce a pure periodic behaviour for coefficients that is
easy to describe. In the general case, irregular fluctuations of a somewhat arithmetic
nature may prevail.

Figure IV.9. The coefficients $[z^n]f(z)$ of the rational function
$f(z) = (1 + 1.02z^4)^{-3}(1 - 1.05z^5)^{-1}$ illustrate a periodic superposition of regimes,
depending on the residue class of n modulo 40.


Pure periodicities. When several dominant singularities of f(z) have the same
modulus and are regularly spaced on the boundary of the disc of convergence, they
may induce complete cancellations of the main exponential terms in the asymptotic
expansion of the coefficient $f_n$. In that case, different regimes will be present in the
coefficients $f_n$ based on congruence properties of n. For instance, the functions

    $\dfrac{1}{1 + z^2} = 1 - z^2 + z^4 - z^6 + z^8 - \cdots$,   $\dfrac{1}{1 - z^3} = 1 + z^3 + z^6 + z^9 + \cdots$,

exhibit patterns of periods 4 and 3, respectively, this corresponding to poles that are
roots of unity of order 4 ($\pm i$), and 3 ($\omega : \omega^3 = 1$). Then, the function

    $\phi(z) = \dfrac{1}{1 + z^2} + \dfrac{1}{1 - z^3} = \dfrac{2 - z^2 + z^3 + z^4 + z^8 + z^9 - z^{10}}{1 - z^{12}}$

has coefficients that obey a pattern of period 12 (for example, the coefficients $\phi_n$ such
that $n \equiv 1, 5, 6, 7, 11$ modulo 12 are zero). Accordingly, the coefficients

    $[z^n]\psi(z)$, where $\psi(z) = \phi(z) + \dfrac{1}{1 - z/2}$,

manifest a different exponential growth when n is congruent to 1, 5, 6, 7, 11 mod 12.
See Figure IV.9 for such a superposition of pure periodicities. In many combinatorial
applications, generating functions involving periodicities can be decomposed at sight,
and the corresponding asymptotic subproblems generated are then solved separately.

> IV.31. Decidability of polynomial properties. Given a polynomial $p(z) \in \mathbb Q[z]$, the following
properties are decidable: (i) whether one of the zeros of p is a root of unity; (ii) whether one
of the zeros of p has an argument that is commensurate with $\pi$. [One can use resultants. An
algorithmic discussion of this and related issues is given in [306].] <
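The period-12 pattern of the function $1/(1+z^2) + 1/(1-z^3)$ discussed under “Pure periodicities” is easily confirmed by expanding the rational function. The sketch below uses a generic (hypothetical) helper that expands a quotient of polynomials by the usual linear recurrence; it is meant as an illustration only.

```python
from fractions import Fraction


def series_coefficients(num, den, n_terms):
    """First n_terms Taylor coefficients of num(z)/den(z).

    Polynomials are given as coefficient lists by increasing degree; den[0] must be non-zero.
    """
    out = []
    for n in range(n_terms):
        acc = Fraction(num[n]) if n < len(num) else Fraction(0)
        for j in range(1, min(n, len(den) - 1) + 1):
            acc -= den[j] * out[n - j]
        out.append(acc / den[0])
    return out


# (2 - z^2 + z^3 + z^4 + z^8 + z^9 - z^10) / (1 - z^12)
PHI_NUM = [2, 0, -1, 1, 1, 0, 0, 0, 1, 1, -1]
PHI_DEN = [1] + [0] * 11 + [-1]

if __name__ == "__main__":
    phi = series_coefficients(PHI_NUM, PHI_DEN, 24)
    print([int(x) for x in phi])
    print(sorted({n % 12 for n, x in enumerate(phi) if x == 0}))
```

The second line printed lists the residue classes modulo 12 on which the coefficients vanish.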


Nonperiodic fluctuations. As a representative example, consider the polynomial
$D(z) = 1 - \frac65 z + z^2$, whose roots are

    $\alpha = \dfrac35 + i\,\dfrac45$,   $\bar\alpha = \dfrac35 - i\,\dfrac45$,

both of modulus 1 (the numbers 3, 4, 5 form a Pythagorean triple), with argument
$\pm\theta_0$ where $\theta_0 = \arctan(\frac43) \doteq 0.92729$. The expansion of the function f(z) = 1/D(z)
starts as

    $\dfrac{1}{1 - \frac65 z + z^2} = 1 + \dfrac65 z + \dfrac{11}{25} z^2 - \dfrac{84}{125} z^3 - \dfrac{779}{625} z^4 - \dfrac{2574}{3125} z^5 + \cdots$,

the sign sequence being

    + + + − − − + + + + − − − + + + − − − − + + + − − − − + + + − − − ···,

which indicates a somewhat irregular oscillating behaviour, where blocks of three or
four pluses follow blocks of three or four minuses.

Figure IV.10. The coefficients of $f(z) = 1/(1 - \frac65 z + z^2)$ exhibit an apparently
chaotic behaviour (left) which in fact corresponds to a discrete sampling of a sine
function (right), reflecting the presence of two conjugate complex poles.

The exact form of the coefficients of f results from a partial fraction expansion:

    $f(z) = \dfrac{a}{1 - \alpha z} + \dfrac{b}{1 - \bar\alpha z}$   with   $a = \dfrac12 - \dfrac38 i$,   $b = \dfrac12 + \dfrac38 i$,

where $\alpha = e^{i\theta_0}$, $\bar\alpha = e^{-i\theta_0}$. Accordingly,

(38)    $f_n = a e^{in\theta_0} + b e^{-in\theta_0} = \dfrac{\sin((n + 1)\theta_0)}{\sin(\theta_0)}$.

This explains the sign changes observed. Since the angle $\theta_0$ is not commensurate with
$\pi$, the coefficients fluctuate but, unlike in our earlier examples, no exact periodicity is
present in the sign patterns. See Figure IV.10 for a rendering and Figure V.3 (p. 299)
for a meromorphic case linked to compositions into prime summands.

Complicated problems of an arithmetical nature may occur if several such singularities
with non-commensurate arguments combine, and some open problems remain
even in the analysis of linear recurring sequences. (For instance no decision procedure
is known to determine whether such a sequence ever vanishes [200].) Fortunately,
such problems occur infrequently in combinatorial applications, where dominant poles
of rational functions (as well as many other functions) tend to have a simple geometry
as we explain next.


> IV.32. Irregular fluctuations and Pythagorean triples. The quantity $\theta_0/\pi$ is an irrational
number, so that the sign fluctuations of (38) are “irregular” (i.e., non-purely periodic). [Proof:
a contrario. Indeed, otherwise, $\alpha = (3 + 4i)/5$ would be a root of unity. But then the minimal
polynomial of $\alpha$ would be a cyclotomic polynomial with non-integral coefficients, a contradiction;
see [401, VIII.3] for the latter property.] <
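The coefficients of $1/(1 - \frac65 z + z^2)$ and their sign pattern can be reproduced with exact rational arithmetic. The sketch below is an illustrative check: it uses the recurrence $f_n = \frac65 f_{n-1} - f_{n-2}$ implied by the denominator and compares the result with the sine formula $\sin((n+1)\theta_0)/\sin\theta_0$, where $\theta_0 = \arctan(4/3)$. The function names are hypothetical.

```python
from fractions import Fraction
from math import atan2, sin


def coefficients(n_terms):
    """Exact Taylor coefficients of 1/(1 - (6/5) z + z^2)."""
    f = [Fraction(1), Fraction(6, 5)]
    while len(f) < n_terms:
        f.append(Fraction(6, 5) * f[-1] - f[-2])
    return f[:n_terms]


def closed_form(n):
    """sin((n + 1) theta_0) / sin(theta_0) with theta_0 = arctan(4/3)."""
    theta0 = atan2(4, 3)
    return sin((n + 1) * theta0) / sin(theta0)


def sign_pattern(n_terms):
    """String of '+' and '-' giving the signs of the first n_terms coefficients."""
    return "".join("+" if x > 0 else "-" for x in coefficients(n_terms))


if __name__ == "__main__":
    print(coefficients(6))
    print(sign_pattern(33))
```

Signs are read off the exact rationals, so that no floating-point ambiguity can affect the pattern.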


> IV.33. Skolem-Mahler-Lech Theorem. Let fn be the sequence of coefficients of a rational 
function, f(z) = A(z)/B(z), where $A, B \in \mathbb Q[z]$. The set of all n such that $f_n = 0$ is the
union of a finite (possibly empty) set and a finite number (possibly zero) of infinite arithmetic
progressions. (The proof is based on p-adic analysis, but the argument is intrinsically
non-constructive; see [452] for an attractive introduction to the subject and references.) <


Periodicity conditions for positive generating functions. By the previous discussion,
it is of interest to locate dominant singularities of combinatorial generating
functions, and, in particular, determine whether their arguments (the “dominant
directions”) are commensurate to $2\pi$. In the latter case, different asymptotic regimes of the
coefficients manifest themselves, depending on the congruence properties of n.


Definition IV.5. For a sequence $(f_n)$ with GF f(z), the support of f, denoted Supp(f),
is the set of all n such that $f_n \ne 0$. The sequence $(f_n)$, as well as its GF f(z), is said
to admit a span d if for some r, there holds

    $\mathrm{Supp}(f) \subseteq r + d\,\mathbb Z_{\ge 0} \equiv \{r,\ r + d,\ r + 2d,\ \ldots\}$.

The largest span, p, is the period, all other spans being divisors of p. If the period is
equal to 1, the sequence and its GF are said to be aperiodic.


If f is analytic at 0, with span d, there exists a function g analytic at 0 such
that $f(z) = z^r g(z^d)$, for some $r \in \mathbb Z_{\ge 0}$. With E := Supp(f), the maximal span
[the period] is determined as p = gcd(E − E) (pairwise differences) as well as p =
gcd(E − {r}) where r := min(E). For instance sin(z) has period 2, cos(z) + cosh(z)
has period 4, $z^3 e^{z^5}$ has period 5, and so on.
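The characterization p = gcd(E − {r}) with r = min(E) translates directly into a few lines of code. The following sketch is an illustration; it works on a finite sample of the support, which is an assumption of the sketch (the true support may be infinite), and the function name is hypothetical.

```python
from functools import reduce
from math import gcd


def period(support):
    """Largest span of a finite sample E of Supp(f): gcd of the differences n - min(E).

    Returns 0 when the sample has a single element (a monomial admits every span).
    """
    e = sorted(set(support))
    return reduce(gcd, (n - e[0] for n in e[1:]), 0)


if __name__ == "__main__":
    print(period(range(1, 40, 2)))        # support of sin(z): odd integers
    print(period(range(0, 40, 4)))        # support of cos(z) + cosh(z): multiples of 4
    print(period([3, 8, 13, 18]))         # a support of the form 3 + 5Z
```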

In the context of periodicities, a basic property is expressed by what we have
chosen to name figuratively the “Daffodil Lemma”. By virtue of this lemma, the span
of a function f with non-negative coefficients is related to the behaviour of |f(z)| as
z varies along circles centred at the origin (Figure IV.11).


Lemma IV.1 (“Daffodil Lemma”). Let f(z) be analytic in $|z| < \rho$ and have
non-negative coefficients at 0. Assume that f does not reduce to a monomial and that for
some non-zero non-positive z satisfying $|z| < \rho$, one has

    $|f(z)| = f(|z|)$.

Then, the following hold: (i) the argument of z must be commensurate to $2\pi$, i.e.,
$z = Re^{i\theta}$ with $\theta/(2\pi) = r/p \in \mathbb Q$ (an irreducible fraction) and 0 < r < p; (ii) f
admits p as a span.


Proof. This classical lemma is a simple consequence of the strong triangle inequality.
Indeed, for Part (i) of the statement, with $z = Re^{i\theta}$, the equality $|f(z)| = f(|z|)$
implies that the complex numbers $f_n R^n e^{in\theta}$, for $n \in \mathrm{Supp}(f)$, all lie on the same ray
(a half-line emanating from 0). This is impossible if $\theta/(2\pi)$ is irrational, since, by
assumption, the expansion of f contains at least two monomials (one cannot have
$n_1\theta \equiv n_2\theta \pmod{2\pi}$). Thus, $\theta/(2\pi) = r/p$ is a rational number. Regarding Part (ii),
consider two distinct indices $n_1$ and $n_2$ in Supp(f) and let $\theta/(2\pi) = r/p$. Then, by
the strong triangle inequality again, one must have $(n_1 - n_2)\theta \equiv 0 \pmod{2\pi}$; that
is, $(n_1 - n_2)r/p = (k_1 - k_2)$, for some $k_1, k_2 \in \mathbb Z_{\ge 0}$. This is only possible if p
divides $n_1 - n_2$. Hence, p is a span. □


Figure IV.11. Illustration of the “Daffodil Lemma”: the images of circles $z = Re^{i\theta}$
(R = 0.4 .. 0.8) rendered by a polar plot of |f(z)| in the case of
$f(z) = z^7 e^{z^{25}} + z^2/(1 - z^{10})$, which has span 5.


Berstel [53] first realized that rational generating functions arising from regular
languages can only have dominant singularities of the form $\rho\omega^j$, where $\omega$ is a certain
root of unity. This property in fact extends to many non-recursive specifications, as
shown by Flajolet, Salvy, and Zimmermann in [255].

Proposition IV.3 (Commensurability of dominant directions). Let $\mathcal S$ be a constructible
labelled class that is non-recursive, in the sense of Theorem IV.8. Assume that the
EGF S(z) has a finite radius of convergence $\rho$. Then there exists a computable
integer $d \ge 1$ such that the set of dominant singularities of S(z) is contained in the set
$\{\rho\omega^j\}$, where $\omega^d = 1$.


Proof. (Sketch; see [53, 255]) By definition, a non-recursive class $\mathcal S$ is obtained from
1 and $\mathcal Z$ by means of a finite number of union, product, sequence, set, and cycle
constructions. We have seen earlier, in Section IV.4 (p. 249), an inductive algorithm
that determines radii of convergence. It is then easy to enrich that algorithm and
determine simultaneously (by induction on the specification) the period of its GF and
the set of dominant directions.

The period is determined by simple rules. For instance, if $\mathcal S = \mathcal T \star \mathcal U$ ($S = T \cdot U$)
and T, U are infinite series with respective periods p, q, one has the implication

    $\mathrm{Supp}(T) \subseteq a + p\mathbb Z$,  $\mathrm{Supp}(U) \subseteq b + q\mathbb Z$  $\Longrightarrow$  $\mathrm{Supp}(S) \subseteq a + b + \xi\mathbb Z$,

with $\xi = \gcd(p, q)$. Similarly, for $\mathcal S = \mathrm{SEQ}(\mathcal T)$,

    $\mathrm{Supp}(T) \subseteq a + p\mathbb Z$  $\Longrightarrow$  $\mathrm{Supp}(S) \subseteq \delta\mathbb Z$,

where now $\delta = \gcd(a, p)$.

Regarding dominant singularities, the case of a sequence construction is typical.
It corresponds to $g(z) = (1 - f(z))^{-1}$. Assume that $f(z) = z^a h(z^p)$, with p the
maximal period, and let $\rho > 0$ be such that $f(\rho) = 1$. The equations determining
any dominant singularity $\zeta$ are $f(\zeta) = 1$, $|\zeta| = \rho$. In particular, the equations imply
$|f(\zeta)| = f(|\zeta|)$, so that, by the Daffodil Lemma, the argument of $\zeta$ must be of the
form $2\pi r/s$. An easy refinement of the argument shows that, for $\delta = \gcd(a, p)$, all the
dominant directions coincide with the multiples of $2\pi/\delta$. The discussion of cycles is
entirely similar since $\log(1 - f)^{-1}$ has the same dominant singularities as $(1 - f)^{-1}$.
Finally, for exponentials, it suffices to observe that $e^f$ does not modify the singularity
pattern of f, since exp(z) is an entire function. □


> IV.34. Daffodil lemma and unlabelled classes. Proposition IV.3 applies to any unlabelled 
class $\mathcal S$ that admits a non-recursive specification, provided its radius of convergence $\rho$ satisfies
$\rho < 1$. (When $\rho = 1$, there is a possibility of having the unit circle as a natural boundary, a
property that is otherwise decidable from the specification.) The case of regular specifications
will be investigated in detail in Section V.3, p. 300. <


Exact formulae. The error terms appearing in the asymptotic expansion of
coefficients of meromorphic functions are already exponentially small. By peeling off the
singularities of a meromorphic function layer by layer, in order of increasing modulus,
one is led to extremely precise, sometimes even exact, expansions for the coefficients.
Such exact representations are found for Bernoulli numbers $B_n$, surjection numbers
$R_n$, as well as Secant numbers $E_{2n}$ and Tangent numbers $E_{2n+1}$, defined by

(39)
    $\sum_{n=0}^{\infty} B_n \dfrac{z^n}{n!} = \dfrac{z}{e^z - 1}$                      (Bernoulli numbers)
    $\sum_{n=0}^{\infty} R_n \dfrac{z^n}{n!} = \dfrac{1}{2 - e^z}$                      (Surjection numbers)
    $\sum_{n=0}^{\infty} E_{2n} \dfrac{z^{2n}}{(2n)!} = \dfrac{1}{\cos(z)}$               (Secant numbers)
    $\sum_{n=0}^{\infty} E_{2n+1} \dfrac{z^{2n+1}}{(2n+1)!} = \tan(z)$                  (Tangent numbers).

Bernoulli numbers. These numbers, traditionally written $B_n$, can be defined by their
EGF $B(z) = z/(e^z - 1)$, and they are central to Euler–Maclaurin expansions (p. 726).
The function B(z) has poles at the points $\chi_k = 2ik\pi$, with $k \in \mathbb Z \setminus \{0\}$, and the residue
at $\chi_k$ is equal to $\chi_k$,

    $\dfrac{z}{e^z - 1} \sim \dfrac{\chi_k}{z - \chi_k}$   $(z \to \chi_k)$.

The expansion theorem for meromorphic functions is applicable here: start with the
Cauchy integral formula, and proceed as in the proof of Theorem IV.10, using as
external contours a large circle of radius R that passes half-way between poles. As R
tends to infinity, the integrand tends to 0 (as soon as $n \ge 2$) because the Cauchy kernel
$z^{-n-1}$ decreases as an inverse power of R while the EGF remains O(R). In the limit,
corresponding to an infinitely large contour, the coefficient integral becomes equal to
the sum of all residues of the meromorphic function over the whole of the complex
plane.

From this argument, we get the representation $B_n = -n! \sum_{k \in \mathbb Z \setminus \{0\}} \chi_k^{-n}$. This
verifies that $B_n = 0$ if n is odd and $n \ge 3$. If n is even, then grouping terms two by
two, we get the exact representation (which also serves as an asymptotic expansion):

(40)    $\dfrac{B_{2n}}{(2n)!} = (-1)^{n-1}\, 2^{1-2n}\, \pi^{-2n} \sum_{k=1}^{\infty} \dfrac{1}{k^{2n}}$.

Reverting the equality, we have also established that

    $\zeta(2n) = (-1)^{n-1}\, 2^{2n-1}\, \pi^{2n}\, \dfrac{B_{2n}}{(2n)!}$   with   $\zeta(s) = \sum_{k=1}^{\infty} \dfrac{1}{k^s}$,   $B_n = n!\,[z^n]\dfrac{z}{e^z - 1}$,

a well-known identity that provides values of the Riemann zeta function $\zeta(s)$ at even
integers as rational multiples of powers of $\pi$.
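The exact representation (40) can be tested against Bernoulli numbers computed independently. In the sketch below, which is only an illustration, the $B_n$ are obtained from the classical recurrence $\sum_{j=0}^{n}\binom{n+1}{j}B_j = 0$ (an assumption of the sketch, not a formula of the text), and the infinite sum over k in (40) is truncated. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial, pi


def bernoulli_numbers(n_max):
    """Exact B_0..B_n_max from sum_{j=0..n} C(n+1, j) B_j = 0 (n >= 1), with B_0 = 1."""
    b = [Fraction(1)]
    for n in range(1, n_max + 1):
        b.append(-sum(comb(n + 1, j) * b[j] for j in range(n)) / (n + 1))
    return b


def rhs_of_40(n, terms=2000):
    """Right-hand side of (40), the sum over k being truncated after `terms` terms."""
    # Summing from the smallest terms upwards limits floating-point round-off.
    partial_zeta = sum(k ** (-2 * n) for k in range(terms, 0, -1))
    return (-1) ** (n - 1) * 2.0 ** (1 - 2 * n) * pi ** (-2 * n) * partial_zeta


if __name__ == "__main__":
    B = bernoulli_numbers(16)
    for n in range(1, 9):
        print(2 * n, float(B[2 * n] / factorial(2 * n)), rhs_of_40(n))
```

Already for moderate n the first term k = 1 of the sum dominates, which is the asymptotic content of (40).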


Surjection numbers. In the same vein, the surjection numbers have EGF $R(z) =
(2 - e^z)^{-1}$ with simple poles at

    $\chi_k = \log 2 + 2ik\pi$   where   $R(z) \sim \dfrac12\,\dfrac{1}{\chi_k - z}$.

Since R(z) stays bounded on circles passing half-way in between poles, we find the
exact formula, $R_n = \frac12\, n! \sum_{k \in \mathbb Z} \chi_k^{-n-1}$. An equivalent real formulation is

(41)    $\dfrac{R_n}{n!} = \dfrac12\left(\dfrac{1}{\log 2}\right)^{n+1} + \sum_{k=1}^{\infty} \dfrac{\cos((n + 1)\theta_k)}{(\log^2 2 + 4k^2\pi^2)^{(n+1)/2}}$   with   $\theta_k := \arctan\left(\dfrac{2k\pi}{\log 2}\right)$,

which exhibits infinitely many harmonics of fast decaying amplitude.


> IV.35. Alternating permutations, tangent and secant numbers. The relation (40) also provides 


a representation of the tangent numbers since $E_{2n-1} = (-1)^{n-1} B_{2n}\, 4^n (4^n - 1)/(2n)$. The
secant numbers $E_{2n}$ satisfy

    $\sum_{k=0}^{\infty} \dfrac{(-1)^k}{(2k + 1)^{2n+1}} = \dfrac{(\pi/2)^{2n+1}}{2\,(2n)!}\, E_{2n}$,

which can be read either as providing an asymptotic expansion of $E_{2n}$ or as an evaluation of the
sums on the left (the values of a Dirichlet L-function) in terms of $\pi$. The asymptotic number of
alternating permutations (pp. 5 and 143) is consequently known to great accuracy. <

> IV.36. Solutions to the equation tan(x) = x. Let $x_n$ be the nth positive root of the equation
tan(x) = x. For any integer $r \ge 1$, the sum $S(r) := \sum_n x_n^{-2r}$ is a computable rational number.
For instance: S(1) = 1/10, S(2) = 1/350, S(3) = 1/7875. [From mathematical folklore.] <


IV. 6.2. Localization of zeros and poles. We gather here a few results that often
prove useful in determining the location of zeros of analytic functions, and hence of
poles of meromorphic functions. A detailed treatment of this topic may be found in
Henrici’s book [329, §4.10].

Let f(z) be an analytic function in a region $\Omega$ and let $\gamma$ be a simple closed curve
interior to $\Omega$, and on which f is assumed to have no zeros. We claim that the quantity

(42)    $N(f; \gamma) = \dfrac{1}{2i\pi} \int_\gamma \dfrac{f'(z)}{f(z)}\, dz$

exactly equals the number of zeros of f inside $\gamma$ counted with multiplicity. [Proof:
the function $f'/f$ has its poles exactly at the zeros of f, and the residue at each pole $\alpha$
equals the multiplicity of $\alpha$ as a root of f; the assertion then results from the residue
theorem.]

Since a primitive function (antiderivative) of $f'/f$ is log f, the integral also
represents the variation of log f along $\gamma$, which is written $[\log f]_\gamma$. This variation
itself reduces to i times the variation of the argument of f along $\gamma$, since
$\log(re^{i\theta}) = \log r + i\theta$ and the modulus r has variation equal to 0 along a closed
contour ($[\log r]_\gamma = 0$). The quantity $[\theta]_\gamma$ is, by its definition, $2\pi$ multiplied by
the number of times the transformed contour $f(\gamma)$ winds about the origin, a number
known as the winding number. This observation is known as the Argument Principle:


Argument Principle. The number of zeros of f(z) (counted with multiplicities)
inside the simple loop $\gamma$ equals the winding number of the transformed
contour $f(\gamma)$ around the origin.


By the same argument, if f is meromorphic in $\Omega \supset \gamma$, then $N(f; \gamma)$ equals the
difference between the number of zeros and the number of poles of f inside $\gamma$, multiplicities
being taken into account. Figure IV.12 exemplifies the use of the argument principle
in localizing zeros of a polynomial.
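The counting integral (42) is easy to evaluate numerically on a circle. The following sketch is an illustration under stated assumptions: the contour is the circle |z| = R, the integral is approximated by the trapezoidal rule (with $dz = iz\,dt$ the integrand becomes $z f'(z)/f(z)$), and the test polynomial is the $P_4(z) = 1 - 2z + z^4$ of Figure IV.12. The function names and the number of sample points are hypothetical choices.

```python
import cmath


def count_zeros(f, df, radius, samples=4096):
    """Approximate N(f; |z| = radius) = (1/(2 i pi)) * integral of f'/f over the circle.

    The circle must not pass through a zero of f.
    """
    total = 0j
    for s in range(samples):
        z = radius * cmath.exp(2j * cmath.pi * s / samples)
        total += z * df(z) / f(z)          # dz = i z dt, so f'/f dz / (2 i pi) -> z f'/f dt / (2 pi)
    return round((total / samples).real)


def p4(z):
    return 1 - 2 * z + z ** 4


def dp4(z):
    return -2 + 4 * z ** 3


if __name__ == "__main__":
    for r in (0.4, 0.8, 1.2, 1.6):
        print(r, count_zeros(p4, dp4, r))
```

Since the exact value is an integer, a crude quadrature followed by rounding suffices as long as the contour stays away from the zeros.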

By similar devices, we get Rouché’s theorem: 


Rouché’s theorem. Let the functions f(z) and g(z) be analytic in a region
containing in its interior the closed simple curve $\gamma$. Assume that f and g
satisfy |g(z)| < |f(z)| on the curve $\gamma$. Then f(z) and f(z) + g(z) have the
same number of zeros inside the interior domain delimited by $\gamma$.


An intuitive way to visualize Rouché’s Theorem is as follows: since |g| < |f|, then
$f(\gamma)$ and $(f + g)(\gamma)$ must have the same winding number.


> IV.37. Proof of Rouché’s theorem. Under the hypothesis of Rouché’s theorem, for $0 \le t \le 1$,
the function h(z) = f(z) + t g(z) is such that $N(h; \gamma)$ is both an integer and an analytic, hence
continuous, function of t in the given range. The conclusion of the theorem follows. <


> IV.38. The Fundamental Theorem of Algebra. Every complex polynomial p(z) of degree n
has exactly n roots. A proof follows by Rouché’s theorem from the fact that, for large enough
|z| = R, the polynomial assumed to be monic is a “perturbation” of its leading term, $z^n$. [Other
proofs can be based on Liouville’s Theorem (Note IV.7, p. 237) or on the Maximum Modulus
Principle (Theorem VII.1, p. 545).] <


> IV.39. Symmetric function of the zeros. Let $S_k(f; \gamma)$ be the sum of the kth powers of the
roots of equation f(z) = 0 inside $\gamma$. One has

    $S_k(f; \gamma) = \dfrac{1}{2i\pi} \int_\gamma \dfrac{f'(z)}{f(z)}\, z^k\, dz$,

by a variant of the proof of the Argument Principle. <


These principles form the basis of numerical algorithms for locating zeros of
analytic functions, in particular the ones closest to the origin, which are of most interest
to us. One can start from an initially large domain and recursively subdivide it until
roots have been isolated with enough precision, the number of roots in a subdomain
being at each stage determined by numerical integration; see Figure IV.12 and refer
for instance to [151] for a discussion. Such algorithms even acquire the status of full
proofs if one operates with guaranteed precision routines (using, for instance, careful
implementations of interval arithmetics).

Figure IV.12. The transforms of $\gamma_j = \{|z| = \frac{4j}{10}\}$ by $P_4(z) = 1 - 2z + z^4$, for j =
1, 2, 3, 4, demonstrate, via winding numbers, that $P_4(z)$ has no zero inside |z| < 0.4,
one zero inside |z| < 0.8, two zeros inside |z| < 1.2 and four zeros inside |z| < 1.6.
The actual zeros are at $\rho_4 = 0.54368$, 1 and $-0.77184 \pm 1.11514i$.


IV. 6.3. Patterns in words: a case study. Analysing the coefficients of a sin- 
gle generating function that is rational is a simple task, often even bordering on the 
trivial, granted the exponential—polynomial formula for coefficients (Theorem IV.9, 
p. 256). However, in analytic combinatorics, we are often confronted with problems 
that involve an infinite family of functions. In that case, Rouché’s Theorem and the 
Argument Principle provide decisive tools for localizing poles, while Theorems IV.3 
(Residue Theorem, p. 234) and IV.10 (Expansion of meromorphic functions, p. 258) 
serve to determine effective error terms. An illustration of this situation is the analysis 
of patterns in words for which GFs have been derived in Chapters I (p. 60) and III 
(p. 212). 


Example IV.11. Patterns in words: asymptotics. All patterns are not born equal. Surprisingly,
in a random sequence of coin tossings, the pattern HTT is likely to occur much sooner (after
8 tosses on average) than the pattern HHH (needing 14 tosses on average); see the preliminary
discussion in Example I.12 (p. 59). Questions of this sort are of obvious interest in the statistical
analysis of genetic sequences [414, 603]. Say you discover that a sequence of length 100,000 on
the four letters A, G, C, T contains the pattern TACTAC twice. Can this be assigned to chance
or is this likely to be a meaningful signal of some yet unknown structure? The difficulty here
lies in quantifying precisely where the asymptotic regime starts, since, by Borges’s Theorem
(Note I.35, p. 61), sufficiently long texts will almost certainly contain any fixed pattern. The
analysis of rational generating functions supplemented by Rouché’s theorem provides definite
answers to such questions, under Bernoulli models at least.

    Length (k)   Types                                    c(z)                 ρ
    k = 3        aab, abb, bba, baa                       1                    0.61803
                 aba, bab                                 1 + z^2              0.56984
                 aaa, bbb                                 1 + z + z^2          0.54368
    k = 4        aaab, aabb, abbb, bbba, bbaa, baaa       1                    0.54368
                 aaba, abba, abaa, bbab, baab, babb       1 + z^3              0.53568
                 abab, baba                               1 + z^2              0.53101
                 aaaa, bbbb                               1 + z + z^2 + z^3    0.51879

Figure IV.13. Patterns of length 3, 4: autocorrelation polynomial and dominant
poles of S(z).

We consider here the class $\mathcal W$ of words over an alphabet $\mathcal A$ of cardinality $m \ge 2$. A
pattern p of some length k is given. As seen in Chapters I and III, its autocorrelation polynomial
is central to enumeration. This polynomial is defined as $c(z) = \sum_{j=0}^{k-1} c_j z^j$, where $c_j$ is 1 if
p coincides with its jth shifted version and 0 otherwise. We consider here the enumeration of
words containing the pattern p at least once, and dually of words excluding the pattern p. In
other words, we look at problems such as: What is the probability that a random text of length n
does (or does not) contain your name as a block of consecutive letters?


The OGF of the class of words excluding p is, we recall,

(43)    $S(z) = \dfrac{c(z)}{z^k + (1 - mz)c(z)}$

(Proposition I.4, p. 61), and we shall start with the case m = 2 of a binary alphabet. The
function S(z) is simply a rational function, but the location and nature of its poles is yet unknown.
We only know a priori that it should have a pole in the positive interval somewhere between $\frac12$
and 1 (by Pringsheim’s Theorem and since its coefficients are in the interval $[1, 2^n]$, for n large
enough). Figure IV.13 gives a small list, for patterns of length k = 3, 4, of the pole $\rho$ of S(z)
that is nearest to the origin. Inspection of the figure suggests $\rho$ to be close to $\frac12$ as soon as the
pattern is long enough. We are going to prove this fact, based on Rouché’s Theorem applied to
the denominator of (43).
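The entries of Figure IV.13 can be recomputed from the definition of the autocorrelation polynomial and the denominator of (43). The sketch below is an illustration for patterns of length $k \ge 3$: it locates the pole by bisection on the interval (1/m, 1.4/m), a bracket chosen for this sketch because the denominator is positive at 1/m and negative at 1.4/m in that range. The function names are hypothetical.

```python
def autocorrelation(pattern):
    """Coefficients c_0..c_{k-1}: c_j = 1 if the pattern coincides with itself shifted by j."""
    k = len(pattern)
    return [1 if pattern[j:] == pattern[:k - j] else 0 for j in range(k)]


def dominant_pole(pattern, m=2, iterations=200):
    """Root of z^k + (1 - m z) c(z) in (1/m, 1.4/m), found by bisection (k >= 3 assumed)."""
    k, c = len(pattern), autocorrelation(pattern)

    def denom(z):
        return z ** k + (1 - m * z) * sum(cj * z ** j for j, cj in enumerate(c))

    lo, hi = 1.0 / m, 1.4 / m
    assert denom(lo) > 0 > denom(hi)       # sign change brackets the root
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if denom(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for pat in ("aab", "aba", "aaa", "aaab", "aaba", "abab", "aaaa"):
        print(pat, autocorrelation(pat), round(dominant_pole(pat), 5))
```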

As regards termwise domination of coefficients, the autocorrelation polynomial lies
between 1 (for less correlated patterns like aaa...ab) and $1 + z + \cdots + z^{k-1}$ (for the special
case aaa...aa). We set aside the special case of p having only equal letters, i.e., a “maximal”
autocorrelation polynomial (this case is discussed at length in the next chapter). Thus, in
this scenario, the autocorrelation polynomial starts as $1 + z^\ell + \cdots$ for some $\ell \ge 2$. Fix the
number A = 0.6, which proves suitable for our subsequent analysis. On |z| = A, we have

(44)    $|c(z)| \ge \left|1 - (A^2 + A^3 + \cdots)\right| = \left|1 - \dfrac{A^2}{1 - A}\right| = \dfrac{1}{10}$.

In addition, the quantity (1 − 2z) ranges over the circle of diameter [−0.2, 2.2] as z varies along
|z| = A, so that $|1 - 2z| \ge 0.2$. All in all, we have found that, for |z| = A,

    $|(1 - 2z)c(z)| \ge 0.02$.

On the other hand, for k > 7, we have $|z^k| < 0.017$ on the circle |z| = A. Then, among
the two terms composing the denominator of (43), the first is strictly dominated by the second
along |z| = A. By virtue of Rouché’s Theorem, the number of roots of the denominator inside
$|z| \le A$ is then the same as the number of roots of (1 − 2z)c(z). The latter number is 1 (due to the
root $\frac12$) since c(z) cannot be 0 by the argument of (44). Figure IV.14 exemplifies the extremely
well-behaved characters of the complex zeros.

Figure IV.14. Complex zeros of $z^{31} + (1 - 2z)c(z)$ represented as joined by a polygonal
line: (left) correlated pattern $a(ba)^{15}$; (right) uncorrelated pattern $a(ab)^{15}$.

In summary, we have found that for all patterns with at least two different letters (ℓ ≥ 2)
and length k ≥ 8, the denominator has a unique root in |z| < A = 0.6. The same property
for lengths k satisfying 4 ≤ k ≤ 7 is then easily verified directly. The case ℓ = 1 where we
are dealing with long runs of identical letters can be subjected to an entirely similar argument
(see also Example V.4, p. 308, for details). Therefore, unicity of a simple pole ρ of S(z) in the
interval (0.5, 0.6) is granted, for a binary alphabet.
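The location of the dominant pole can be checked numerically for any given pattern. The following is a minimal sketch, not part of the original argument: it builds the autocorrelation polynomial of a pattern, then locates the smallest real zero of z^k + (1 − mz)c(z) to the right of 1/m by a coarse scan followed by bisection. The function names (`autocorrelation`, `denominator`, `dominant_pole`) and the scan step are our own illustrative choices.

```python
def autocorrelation(pattern):
    """Coefficients c_0..c_{k-1}: c_j = 1 when the pattern matches itself shifted by j."""
    k = len(pattern)
    return [1 if pattern[j:] == pattern[:k - j] else 0 for j in range(k)]


def denominator(pattern, m, z):
    """Evaluate z^k + (1 - m z) c(z) for a pattern over an m-ary alphabet."""
    k = len(pattern)
    c = sum(cj * z**j for j, cj in enumerate(autocorrelation(pattern)))
    return z**k + (1 - m * z) * c


def dominant_pole(pattern, m=2, step=1e-3, tol=1e-13):
    """Smallest real zero of the denominator to the right of 1/m (scan, then bisect)."""
    lo = 1.0 / m                      # the denominator equals m^(-k) > 0 at z = 1/m
    hi = lo + step
    while denominator(pattern, m, hi) > 0:
        lo, hi = hi, hi + step
        if lo > 1.0:                  # e.g. pattern "ab": double root at z = 1, no sign change
            raise ValueError("no sign change found in (1/m, 1]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if denominator(pattern, m, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

For the pattern aaa the denominator reduces to 1 − z − z² − z³, and for aab to (1 − z)(1 − z − z²), so the computed poles can be compared with the reciprocals of the tribonacci constant and of the golden ratio respectively.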

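It is then a simple matter to determine the local expansion of S(z) near z = ρ,

$S(z) \underset{z \to \rho}{\sim} \dfrac{\widetilde{\Lambda}}{\rho - z}, \qquad \widetilde{\Lambda} := \dfrac{c(\rho)}{2c(\rho) - (1 - 2\rho)\,c'(\rho) - k\rho^{k-1}},$

from which a precise estimate for coefficients results from Theorems IV.9 (p. 256) and IV.10
(p. 258).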

The computation finally extends almost verbatim to non-binary alphabets, with ρ being
now close to 1/m. It suffices to use the disc of radius A = 1.2/m. The Rouché part of the
argument grants us unicity of the dominant pole in the interval (1/m, A) for k ≥ 5 when
m = 3, and for k ≥ 4 and any m ≥ 4. (The remaining cases are easily checked individually.)
Proposition IV.4. Consider an m-ary alphabet. Let p be a fixed pattern of length k ≥ 4, with
autocorrelation polynomial c(z). Then the probability that a random word of length n does not
contain p as a pattern (a block of consecutive letters) satisfies

(45)    $\mathbb{P}_{\mathcal{W}_n}(p \text{ does not occur}) = \Lambda_p\,(m\rho)^{-n-1} + O\!\left(\left(\tfrac{5}{6}\right)^{n}\right),$

where ρ ≡ ρ_p is the unique root in (1/m, 6/(5m)) of the equation z^k + (1 − mz)c(z) = 0 and

$\Lambda_p := \dfrac{m\,c(\rho)}{m\,c(\rho) - c'(\rho)(1 - m\rho) - k\rho^{k-1}}.$

Despite their austere appearance, these formulae have indeed a fairly concrete content. 
First, the equation satisfied by ρ can be put under the form mz = 1 + z^k/c(z), and, since
ρ is close to 1/m, we may expect the approximation (remember the use of “≈” as meaning
“numerically approximately equal”, but not implying strict asymptotic equivalence)

$m\rho \approx 1 + \dfrac{1}{\gamma m^{k}},$

where γ := c(m^{−1}) satisfies 1 ≤ γ < m/(m − 1). By similar principles, the probabilities
in (45) are approximately

$\mathbb{P}_{\mathcal{W}_n}(p \text{ does not occur}) \approx \left(1 + \dfrac{1}{\gamma m^{k}}\right)^{-n} \approx e^{-n/(\gamma m^{k})}.$


For a binary alphabet, this tells us that the occurrence of a pattern of length k starts becoming
likely when n is of the order of 2^k, that is, when k is of the order of log_2 n. The more precise
moment when this happens must depend (via γ) on the autocorrelation of the pattern, with
strongly correlated patterns having a tendency to occur a little late. (This vastly generalizes our
empirical observations of Chapter I.) However, the mean number of occurrences of a pattern in
a text of length n does not depend on the shape of the pattern. The apparent paradox is easily
resolved, as we already observed in Chapter I: correlated patterns tend to occur late, while
being prone to appear in clusters. For instance, the “late” pattern aaa, when it occurs, still has
probability 1/2 to occur at the next position as well and cash in another occurrence; in contrast no
such possibility is available to the “early” uncorrelated pattern aab, whose occurrences must
be somewhat spread out.
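The contrast between “late” and “early” patterns can be observed by exhaustive enumeration on short binary words. The sketch below is our own illustration (the helper names are hypothetical): it counts the words of length n avoiding a pattern, and separately the total number of (possibly overlapping) occurrences of the pattern over all words of length n.

```python
from itertools import product


def count_avoiding(pattern, n, alphabet="ab"):
    """Number of words of length n over the alphabet that do not contain the pattern."""
    return sum(1 for w in product(alphabet, repeat=n)
               if pattern not in "".join(w))


def total_occurrences(pattern, n, alphabet="ab"):
    """Total number of occurrences (overlaps allowed) summed over all words of length n."""
    k = len(pattern)
    total = 0
    for w in product(alphabet, repeat=n):
        word = "".join(w)
        total += sum(1 for i in range(n - k + 1) if word[i:i + k] == pattern)
    return total


n = 10
avoid_aaa = count_avoiding("aaa", n)     # correlated pattern: avoided by more words
avoid_aab = count_avoiding("aab", n)     # uncorrelated pattern: avoided by fewer words
occ_aaa = total_occurrences("aaa", n)    # same total for both patterns
occ_aab = total_occurrences("aab", n)
```

With n = 10, more words avoid aaa than aab, although both patterns have the same total number of occurrences, (n − 2)·2^{n−3}: the occurrences of aaa are concentrated in fewer words.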

Such analyses are important as they can be used to develop a precise understanding of
the behaviour of data compression algorithms (the Lempel-Ziv scheme); see Julien Fayolle’s
contribution [204] for details. □


▷ IV.40. Multiple pattern occurrences. A similar analysis applies to the generating function
S^⟨s⟩(z) of words containing a fixed number s of occurrences of a pattern p. The OGF
is obtained by expanding (with respect to u) the BGF W(z, u) obtained in Chapter III, p. 212,
by means of an inclusion-exclusion argument. For s ≥ 1, one finds

$S^{\langle s\rangle}(z) = z^{k}\,\dfrac{N(z)^{s-1}}{D(z)^{s+1}}, \qquad D(z) = z^{k} + (1 - mz)\,c(z), \qquad N(z) = z^{k} + (1 - mz)(c(z) - 1),$

which now has a pole of multiplicity s + 1 at z = ρ. ◁


▷ IV.41. Patterns in Bernoulli sequences: asymptotics. Similar results hold when letters are
assigned non-uniform probabilities, p_j = P(a_j), for a_j ∈ A. The weighted autocorrelation
polynomial is then defined by protrusions, as in Note III.39 (p. 213). Multiple pattern
occurrences can be also analysed. ◁




IV.7. Singularities and functional equations 


In the various combinatorial examples discussed so far in this chapter, we have 
been dealing with functions that are given by explicit expressions. Such situations 
essentially cover non-recursive structures as well as the very simplest recursive ones, 
such as Catalan or Motzkin trees, whose generating functions are expressible in terms 
of radicals. In fact, as we shall see extensively in this book, complex analytic methods 
are instrumental in analysing coefficients of functions implicitly specified by func- 
tional equations. In other words: the nature of a functional equation can often provide 
information regarding the singularities of its solution. Chapter V will illustrate this 
philosophy in the case of rational functions defined by systems of positive equations; 
a very large number of examples will then be given in Chapters VI and VII, where 
singularities that are much more general than poles are treated. 

In this section, we discuss three representative functional equations,

$f(z) = z e^{f(z)}, \qquad f(z) = z + f(z^2 + z^3), \qquad f(z) = \dfrac{1}{1 - z f(z^2)},$

associated, respectively, to Cayley trees, balanced 2-3 trees, and Pólya’s alcohols.
These illustrate the use of fundamental inversion or iteration properties for locating
dominant singularities and deriving exponential growth estimates of coefficients.


IV.7.1. Inverse functions. We start with a generic problem already introduced
on p. 249: given a function ψ analytic at a point y0 with z0 = ψ(y0), what can be said
about its inverse, namely the solution(s) to the equation ψ(y) = z when z is near z0
and y near y0?

Let us examine what happens when ψ′(y0) ≠ 0, first without paying attention to
analytic rigour. One has locally (“≈” means as usual “approximately equal”)

(46)    $\psi(y) \approx \psi(y_0) + \psi'(y_0)(y - y_0),$

so that the equation ψ(y) = z should admit, for z near z0, a solution satisfying

(47)    $y \approx y_0 + \dfrac{1}{\psi'(y_0)}\,(z - z_0).$

If this is granted, the solution being locally linear, it is differentiable, hence analytic.
The Analytic Inversion Lemma¹⁰ provides a firm foundation for such calculations.
Lemma IV.2 (Analytic Inversion). Let ψ(z) be analytic at y0, with ψ(y0) = z0.
Assume that ψ′(y0) ≠ 0. Then, for z in some small neighbourhood Ω0 of z0, there
exists an analytic function y(z) that solves the equation ψ(y) = z and is such that
y(z0) = y0.

Proof. (Sketch) The proof involves ideas analogous to those used to establish Rouché’s
Theorem and the Argument Principle (see especially the argument justifying Equation
(42), p. 269). As a preliminary step, define the integrals (j ∈ Z≥0)

(48)    $\sigma_j(z) := \dfrac{1}{2i\pi} \displaystyle\int_{\gamma} \dfrac{\psi'(y)}{\psi(y) - z}\, y^{j}\, dy,$

where γ is a small enough circle centred at y0 in the y-plane.

¹⁰ A more general statement and several proof techniques are also discussed in Appendix B.5: Implicit
Function Theorem, p. 753.

First consider σ0. This function satisfies σ0(z0) = 1 [by the Residue Theorem]
and is a continuous function of z whose value can only be an integer, this value being
the number of roots of the equation ψ(y) = z. Thus, for z close enough to z0, one
must have σ0(z) = 1. In other words, the equation ψ(y) = z has exactly one solution,
the function ψ is locally invertible and a solution y = y(z) that satisfies y(z0) = y0 is
well-defined.

Next examine σ1. By the Residue Theorem once more, the integral defining σ1(z)
is the sum of the roots of the equation ψ(y) = z that lie inside γ, that is, in our case,
the value of y(z) itself. (This is also a particular case of Note IV.39, p. 270.) Thus,
one has σ1(z) = y(z). Since the integral defining σ1(z) depends analytically on z for
z close enough to z0, analyticity of y(z) results. □
▷ IV.42. Details. Let ψ be analytic in an open disc D centred at y0. Then, there exists a
small circle γ centred at y0 and contained in D such that ψ(y) ≠ z0 on γ. [Zeros of analytic
functions are isolated, a fact that results from the definition of an analytic expansion.] The
integrals σ_j(z) are thus well defined for z restricted to be close enough to z0, which ensures
that there exists a δ > 0 such that |ψ(y) − z| > δ for all y ∈ γ. One can then expand the
integrand as a power series in (z − z0), integrate the expansion termwise, and form in this way
the analytic expansions of σ0, σ1 at z0. (This line of proof follows [334, I, §9.4].) ◁


▷ IV.43. Inversion and majorant series. The process corresponding to (46) and (47) can be
transformed into a sound proof: first derive a formal power series solution, then verify that the
formal solution is locally convergent using the method of majorant series (p. 250). ◁

The Analytic Inversion Lemma states the following: An analytic function locally 
admits an analytic inverse near any point where its first derivative is non-zero. How- 
ever, as we see next, a function cannot be analytically inverted in a neighbourhood of 
a point where its first derivative vanishes. 

Consider now a function ψ(y) such that ψ′(y0) = 0 but ψ″(y0) ≠ 0; then, by the
Taylor expansion of ψ, one expects

(49)    $\psi(y) \approx \psi(y_0) + \dfrac{1}{2}(y - y_0)^2\,\psi''(y_0).$

Solving formally for y now indicates a locally quadratic dependency

$(y - y_0)^2 \approx \dfrac{2}{\psi''(y_0)}\,(z - z_0),$

and the inversion problem admits two solutions satisfying

(50)    $y \approx y_0 \pm \sqrt{\dfrac{2}{\psi''(y_0)}}\,\sqrt{z - z_0}.$

What this informal argument suggests is that the solutions have a singularity at z0, and,
in order for them to be suitably specified, one must somehow restrict their domain of
definition: the case of √z (the root(s) of y² − z = 0) discussed on p. 230 is typical.
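As a worked instance of this informal quadratic inversion, one may take the function u e^{−u}, whose inverse is the Cayley tree function discussed later in this section. The computation below is our own illustration, carried out at the point where the first derivative vanishes; it only applies the approximations above and claims no more rigour than they do.

```latex
\psi(y) = y e^{-y}, \qquad \psi'(y) = (1 - y) e^{-y}, \qquad \psi''(y) = (y - 2) e^{-y},
\\
y_0 = 1, \qquad z_0 = \psi(1) = e^{-1}, \qquad \psi'(1) = 0, \qquad \psi''(1) = -e^{-1} \neq 0,
\\
(y - 1)^2 \approx \frac{2}{\psi''(1)}\,(z - z_0) = -2e\,(z - e^{-1}) = 2\,(1 - ez),
\\
y \approx 1 \pm \sqrt{2}\,\sqrt{1 - ez}.
```

The minus sign gives the determination that increases to 1 as z increases to e^{−1} along the real axis, which is the one relevant to a solution with non-negative coefficients.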

Given some point z0 and a neighbourhood Ω of z0, the slit neighbourhood along
direction θ is the set

$\Omega^{\setminus\theta} := \{\, z \in \Omega \;\mid\; \arg(z - z_0) \not\equiv \theta \bmod 2\pi,\ z \neq z_0 \,\}.$

We state:
Lemma IV.3 (Singular Inversion). Let ψ(y) be analytic at y0, with ψ(y0) = z0.
Assume that ψ′(y0) = 0 and ψ″(y0) ≠ 0. There exists a small neighbourhood Ω0
of z0 such that the following holds: for any fixed direction θ, there exist two functions,
y1(z) and y2(z), defined on Ω0^{∖θ}, that satisfy ψ(y(z)) = z; each is analytic in Ω0^{∖θ}, has
a singularity at the point z0, and satisfies lim_{z→z0} y(z) = y0.

Proof. (Sketch) Define the functions σ_j(z) as in the proof of the previous lemma,
Equation (48). One now has σ0(z) = 2, that is, the equation ψ(y) = z possesses two
roots near y0, when z is near z0. In other words ψ effects a double covering of a small
neighbourhood Ω of y0 onto the image neighbourhood Ω0 = ψ(Ω) ∋ z0. By possibly
restricting Ω, we may furthermore assume that ψ′(y) only vanishes at y0 in Ω (zeros
of analytic functions are isolated) and that Ω is simply connected.

Fix any direction θ and consider the slit neighbourhood Ω0^{∖θ}. Fix a point ζ in
this slit domain; it has two preimages, η1, η2 ∈ Ω. Pick up the one named η1. Since
ψ′(η1) is non-zero, the Analytic Inversion Lemma applies: there is a local analytic
inverse y1(z) of ψ. This y1(z) can then be uniquely continued¹¹ to the whole of Ω0^{∖θ},
and similarly for y2(z). We have thus obtained two distinct analytic inverses.

Assume a contrario that y1(z) can be analytically continued at z0. It would then
admit a local expansion

$y_1(z) = \displaystyle\sum_{n \ge 0} c_n (z - z_0)^{n},$

while satisfying ψ(y1(z)) = z. But then, composing the expansions of ψ and y1 would
entail

$\psi(y_1(z)) = z_0 + O\big((z - z_0)^2\big) \qquad (z \to z_0),$

which cannot coincide with the identity function (z). A contradiction has been reached.
The point z0 is thus a singular point for y1 (as well as for y2). □
▷ IV.44. Singular inversion and majorant series. In a way that parallels Note IV.43, the process
summarized by Equations (49) and (50) can be justified by the method of majorant series, which
leads to an alternative proof of the Singular Inversion Lemma. ◁

▷ IV.45. Higher order branch points. If all derivatives of ψ till order r − 1 inclusive vanish
at y0, there are r inverses, y1(z), ..., yr(z), defined over a slit neighbourhood of z0. ◁


Tree enumeration. We can now consider the problem of obtaining information
on the coefficients of a function y(z) defined by an implicit equation

(51)    $y(z) = z\,\phi(y(z)),$

when φ(u) is analytic at u = 0. In order for the problem to be well-posed (i.e.,
algebraically, in terms of formal power series, as well as analytically, near the origin,
there should be a unique solution for y(z)), we assume that φ(0) ≠ 0. Equation (51)
may then be rephrased as

(52)    $\psi(y(z)) = z \quad\text{where}\quad \psi(u) = \dfrac{u}{\phi(u)},$

so that it is in fact an instance of the inversion problem for analytic functions.

¹¹ The fact of slitting Ω0 makes the resulting domain simply connected, so that analytic continuation
becomes uniquely defined. In contrast, the punctured domain Ω0 \ {z0} is not simply connected, so that the
argument cannot be applied to it. As a matter of fact, y1(z) gets continued to y2(z), when the ray of angle θ
is crossed: the point z0 where two determinations meet is a branch point.

Figure IV.15. Singularities of inverse functions: φ(u) = e^u (left); ψ(u) = u/φ(u)
(centre); y = Inv(ψ) (right).

Equation (51) occurs in the counting of various types of trees, as seen in Subsections
I.5.1 (p. 65), II.5.1 (p. 126), and III.6.2 (p. 193). A typical case is φ(u) = e^u,
which corresponds to labelled non-plane trees (Cayley trees). The function φ(u) =
(1 + u)² is associated to unlabelled plane binary trees and φ(u) = 1 + u + u² to unary-binary
trees (Motzkin trees). A full analysis was developed by Meir and Moon [435],
themselves elaborating on earlier ideas of Pólya [488, 491] and Otter [466]. In all
these cases, the exponential growth rate of the number of trees can be automatically
determined.


Proposition IV.5. Let φ be a function analytic at 0, having non-negative Taylor
coefficients, and such that φ(0) ≠ 0. Let R ≤ +∞ be the radius of convergence of the
series representing φ at 0. Under the condition

(53)    $\displaystyle\lim_{x \to R^{-}} \dfrac{x\,\phi'(x)}{\phi(x)} > 1,$

there exists a unique solution τ ∈ (0, R) of the characteristic equation

(54)    $\dfrac{\tau\,\phi'(\tau)}{\phi(\tau)} = 1.$

Then, the formal solution y(z) of the equation y(z) = zφ(y(z)) is analytic at 0 and
its coefficients satisfy the exponential growth formula:

$[z^{n}]\,y(z) \bowtie \left(\dfrac{1}{\rho}\right)^{n} \quad\text{where}\quad \rho = \dfrac{\tau}{\phi(\tau)} = \dfrac{1}{\phi'(\tau)}.$

Note that condition (53) is automatically realized as soon as φ(R⁻) = +∞, which
covers our earlier examples as well as all the cases where φ is an entire function (e.g.,
a polynomial). Figure IV.15 displays graphs of functions on the real line associated to
a typical inversion problem, that of Cayley trees, where φ(u) = e^u.

Type                 φ(u)           (R)    τ      ρ         y_n ⋈ ρ^{−n}
binary tree          (1 + u)²       (∞)    1      1/4       y_n ⋈ 4^n   (p. 67)
Motzkin tree         1 + u + u²     (∞)    1      1/3       y_n ⋈ 3^n   (p. 68)
gen. Catalan tree    1/(1 − u)      (1)    1/2    1/4       y_n ⋈ 4^n   (p. 65)
Cayley tree          e^u            (∞)    1      e^{−1}    y_n ⋈ e^n   (p. 128)

Figure IV.16. Exponential growth for classical tree families.

Proof. By Note IV.46 below, the function xφ′(x)/φ(x) is an increasing function of x
for x ∈ (0, R). Condition (53) thus guarantees the existence and unicity of a solution
of the characteristic equation. (Alternatively, rewrite the characteristic equation as
φ0 = φ2τ² + 2φ3τ³ + ···, where the right side is clearly an increasing function.)

Next, we observe that the equation y = zφ(y) admits a unique formal power series
solution, which furthermore has non-negative coefficients. (This solution can for
instance be built by the method of indeterminate coefficients.) The Analytic Inversion
Lemma (Lemma IV.2) then implies that this formal solution represents a function,
y(z), that is analytic at 0, where it satisfies y(0) = 0.

Now comes the hunt for singularities and, by Pringsheim’s Theorem, one may
restrict attention to the positive real axis. Let r ≤ +∞ be the radius of convergence
of y(z) at 0 and set y(r) := lim_{x→r⁻} y(x), which is well defined (although possibly
infinite), given positivity of coefficients. Our goal is to prove that y(r) = τ.

- Assume a contrario that y(r) < τ. One would then have ψ′(y(r)) ≠ 0. By
the Analytic Inversion Lemma, y(z) would be analytic at r, a contradiction.

- Assume a contrario that y(r) > τ. There would then exist r* ∈ (0, r) such
that ψ′(y(r*)) = 0. But then y would be singular at r*, by the Singular
Inversion Lemma, also a contradiction.

Thus, one has y(r) = τ, which is finite. Finally, since y and ψ are inverse functions,
one must have

$r = \psi(\tau) = \tau/\phi(\tau) = \rho,$

by continuity as x → r⁻, which completes the proof. □


Proposition IV.5 thus yields an algorithm that produces the exponential growth
rate associated to tree functions. This rate is itself invariably a computable number
as soon as φ is computable (i.e., its sequence of coefficients is computable). This
computability result complements Theorem IV.8 (p. 251), which is relative to
non-recursive structures only.

As an example of application of Proposition IV.5, general Catalan trees correspond
to φ(y) = (1 − y)^{−1}, whose radius of convergence is R = 1. The characteristic
equation is τ/(1 − τ) = 1, which implies τ = 1/2 and ρ = 1/4. We obtain (not a
surprise!) y_n ⋈ 4^n, a weak asymptotic formula for the Catalan numbers. Similarly,
for Cayley trees, φ(u) = e^u and R = +∞. The characteristic equation reduces to
(τ − 1)e^τ = 0, so that τ = 1 and ρ = e^{−1}, giving a weak form of Stirling’s formula:
[z^n]y(z) = n^{n−1}/n! ⋈ e^n. Figure IV.16 summarizes the application of the method to
a few already encountered tree families.
As our previous discussion suggests, the dominant singularity of tree generating
functions is, under mild conditions, of the square-root type. Such a singular behaviour
can then be analysed by the methods of Chapter VI: the coefficients admit an asymptotic
form

$[z^{n}]\,y(z) \sim C \cdot \rho^{-n}\, n^{-3/2},$

with a subexponential factor of the form n^{−3/2}; see Section VI.7, p. 402.
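The algorithm implicit in Proposition IV.5 can be sketched in a few lines: solve the characteristic equation τφ′(τ)/φ(τ) = 1 by bisection (its left side is increasing on (0, R)), then return τ/φ(τ). This is only an illustrative sketch in floating point; the function names and the way the radius R is passed in are our own assumptions, and the caller is expected to supply φ together with its derivative.

```python
import math


def characteristic_root(phi, dphi, R=math.inf, tol=1e-13):
    """Solve tau * phi'(tau) / phi(tau) = 1 on (0, R) by bisection."""
    def g(x):
        return x * dphi(x) / phi(x) - 1.0

    # Upper end of the search interval: just inside R, or found by doubling when R is infinite.
    hi = 1.0 if math.isinf(R) else R * (1.0 - 1e-9)
    while math.isinf(R) and g(hi) < 0 and hi < 1e6:
        hi *= 2.0
    if g(hi) < 0:
        raise ValueError("the limit condition of the proposition does not seem to hold")
    lo = 0.0                          # g(0) = -1 since phi(0) != 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def growth_rate(phi, dphi, R=math.inf):
    """Return (tau, rho, 1/rho), where 1/rho is the exponential growth rate."""
    tau = characteristic_root(phi, dphi, R)
    rho = tau / phi(tau)
    return tau, rho, 1.0 / rho


catalan = growth_rate(lambda u: 1 / (1 - u), lambda u: 1 / (1 - u) ** 2, R=1.0)
cayley = growth_rate(math.exp, math.exp)
motzkin = growth_rate(lambda u: 1 + u + u * u, lambda u: 1 + 2 * u)
binary = growth_rate(lambda u: (1 + u) ** 2, lambda u: 2 * (1 + u))
```

The four calls correspond to the tree families discussed in the text, with expected growth rates 4, e, 3 and 4.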


▷ IV.46. Convexity of GFs, Boltzmann models, and the Variance Lemma. Let φ(z) be a
non-constant analytic function with non-negative coefficients and a non-zero radius of
convergence R, such that φ(0) ≠ 0. For x ∈ (0, R) a parameter, define the Boltzmann random
variable Ξ (of parameter x) by the property

(55)    $\mathbb{P}(\Xi = n) = \dfrac{\phi_n x^{n}}{\phi(x)}, \quad\text{with}\quad \mathbb{E}(s^{\Xi}) = \dfrac{\phi(sx)}{\phi(x)}$

the probability generating function of Ξ. By differentiation, the first two moments of Ξ are

$\mathbb{E}(\Xi) = \dfrac{x\,\phi'(x)}{\phi(x)}, \qquad \mathbb{E}(\Xi^{2}) = \dfrac{x^{2}\phi''(x)}{\phi(x)} + \dfrac{x\,\phi'(x)}{\phi(x)}.$

There results, for any non-constant GF φ, the general convexity inequality valid for 0 < x < R:

(56)    $\dfrac{d}{dx}\left(\dfrac{x\,\phi'(x)}{\phi(x)}\right) > 0,$

due to the fact that the variance of a non-degenerate random variable is always positive.
Equivalently, the function log(φ(e^t)) is convex for t ∈ (−∞, log R). (In statistical physics, a
Boltzmann model (of parameter x) corresponds to a class Φ (with OGF φ) from which elements
are drawn according to the size distribution (55). An alternative derivation of (56) is given in
Note VIII.4, p. 550.) ◁


▷ IV.47. A variant form of the inversion problem. Consider the equation y = z + φ(y), where φ
is assumed to have non-negative coefficients and be entire, with φ(u) = O(u²) at u = 0. This
corresponds to a simple variety of trees in which trees are counted by the number of their leaves
only. For instance, we have already encountered labelled hierarchies (phylogenetic trees in
Section II.5, p. 128) corresponding to φ(u) = e^u − 1 − u, which gives rise to one of “Schröder’s
problems”. Let τ be the root of φ′(τ) = 1 and set ρ = τ − φ(τ). Then, [z^n]y(z) ⋈ ρ^{−n}. For
the EGF L of labelled hierarchies (L = z + e^L − 1 − L), this gives L_n/n! ⋈ (2 log 2 − 1)^{−n}.
(Observe that Lagrange inversion also provides $[z^{n}]\,y(z) = \frac{1}{n}[w^{n-1}]\big(1 - w^{-1}\phi(w)\big)^{-n}$.) ◁


IV.7.2. Iteration. The study of iteration of analytic functions was launched by 
Fatou and Julia in the first half of the twentieth century. Our reader is certainly aware 
of the beautiful images associated with the name of Mandelbrot whose works have 
triggered renewed interest in these questions, now classified as belonging to the field 
of “complex dynamics” [31, 156, 443, 473]. In particular, the sets that appear in this 
context are often of a fractal nature. Mathematical objects of this sort are occasionally 
encountered in analytic combinatorics. We present here the first steps of a classic 
analysis of balanced trees published by Odlyzko [459] in 1982. 


Example IV.12. Balanced trees. Consider the class E of balanced 2-3 trees defined as trees
whose node degrees are restricted to the set {0, 2, 3}, with the additional property that all leaves
are at the same distance from the root (Note I.67, p. 91). We adopt as notion of size the number
of leaves (also called external nodes), the list of all 4 trees of size 8 being:

[Diagram: the four balanced 2-3 trees with 8 leaves.]

x0  = 0.6
x1  = 0.576
x2  = 0.522878976
x3  = 0.416358802
x4  = 0.245532388
x5  = 0.075088357
x6  = 0.006061629
x7  = 0.000036966
x8  = 0.000000001
x9  = 1.867434390 × 10^{−18}
x10 = 3.487311201 × 10^{−36}

Figure IV.17. The iterates of a point x0 ∈ (0, 1/ϕ), here x0 = 0.6, by σ(z) = z² + z³
converge fast to 0.


Given an existing tree, a new tree is obtained by substituting in all possible ways to each external
node (□) either a pair (□, □) or a triple (□, □, □), and symbolically, one has

$\mathcal{E}[\square] = \square + \mathcal{E}\big[\square \to (\square\square + \square\square\square)\big].$

In accordance with the specification, the OGF of E satisfies the functional equation

(57)    $E(z) = z + E(z^2 + z^3),$

corresponding to the seemingly innocuous recurrence

$E_n = \displaystyle\sum_{k=0}^{n} \binom{k}{n - 2k} E_k \quad\text{with}\quad E_0 = 0,\ E_1 = 1.$
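The recurrence E_n = Σ_k C(k, n − 2k) E_k, with E_0 = 0 and E_1 = 1, follows from the functional equation because the coefficient of z^n in (z² + z³)^k is the binomial coefficient C(k, n − 2k). It can be run directly; the sketch below is our own illustration (the function name is hypothetical) and restricts the sum to the indices k with 2k ≤ n ≤ 3k, where the binomial coefficient is non-zero.

```python
from math import comb


def balanced_23_counts(n_max):
    """E[n] = number of balanced 2-3 trees with n leaves, for 0 <= n <= n_max."""
    E = [0] * (n_max + 1)
    if n_max >= 1:
        E[1] = 1
    for n in range(2, n_max + 1):
        # comb(k, n - 2k) is non-zero only when ceil(n/3) <= k <= floor(n/2)
        E[n] = sum(comb(k, n - 2 * k) * E[k]
                   for k in range((n + 2) // 3, n // 2 + 1))
    return E


counts = balanced_23_counts(10)
```

The value obtained for n = 8 agrees with the four trees of size 8 mentioned in the text.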
k=0 


Let σ(z) = z² + z³. Equation (57) can be expanded by iteration in the ring of formal
power series,

(58)    $E(z) = z + \sigma(z) + \sigma^{[2]}(z) + \sigma^{[3]}(z) + \cdots,$

where σ^{[j]}(z) denotes the jth iterate of the polynomial σ: σ^{[0]}(z) = z, σ^{[h+1]}(z) = σ^{[h]}(σ(z)) =
σ(σ^{[h]}(z)). Thus, E(z) is nothing but the sum of all iterates of σ. The problem is to determine
the radius of convergence of E(z), and, by Pringsheim’s theorem, the quest for dominant
singularities can be limited to the positive real line.
For z > 0, the polynomial σ(z) has a unique fixed point, ρ = σ(ρ), at

$\rho = \dfrac{1}{\varphi} \quad\text{where}\quad \varphi = \dfrac{1 + \sqrt{5}}{2}$

is the golden ratio. Also, for any positive x satisfying x < ρ, the iterates σ^{[j]}(x) do converge
to 0; see Figure IV.17. Furthermore, since σ(z) ∼ z² near 0, these iterates converge to 0 doubly
exponentially fast (Note IV.48). By the triangle inequality, we have |σ(z)| ≤ σ(|z|), so that the
sum in (58) is a normally converging sum of analytic functions, and is thus itself analytic for
|z| < ρ. Consequently, E(z) is analytic in the whole of the open disc |z| < ρ.

Figure IV.18. Left: the fractal domain of analyticity of E(z) (inner domain in white
and gray, with lighter areas representing slower convergence of the iterates of σ)
and its circle of convergence. Right: the ratio E_n/(ϕ^n n^{−1}) plotted against log n for
n = 1..500 confirms that E_n ⋈ ϕ^n and illustrates the periodic fluctuations of (60).

It remains to prove that the radius of convergence of E(z) is exactly equal to ρ. To that
purpose it suffices to observe that E(z), as given by (58), satisfies

$E(x) \to +\infty \quad\text{as}\quad x \to \rho^{-}.$

Let N be an arbitrarily large but fixed integer. It is possible to select a positive x_N sufficiently
close to ρ with x_N < ρ, such that the Nth iterate σ^{[N]}(x_N) is larger than 1/2 (the function
σ^{[N]}(x) admits ρ as a fixed point and it is continuous and increasing at ρ). Since the iterates
of a point x < ρ are decreasing, the first N terms of the sum expression (58) all exceed 1/2,
which entails the lower bound E(x_N) > N/2 for such an x_N < ρ. Thus E(x) is
unbounded as x → ρ⁻ and ρ is a singularity.
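Both halves of this argument, fast convergence of the iterates below the fixed point and blow-up of the sum of iterates near it, are easy to observe numerically. The following sketch is our own illustration with hypothetical helper names; it iterates x ↦ x² + x³ in floating point and sums a fixed number of iterates, which is ample when the starting point is not extremely close to the fixed point.

```python
import math


def sigma(x):
    """The polynomial z^2 + z^3 whose iterates are summed."""
    return x**2 + x**3


def iterates(x0, count):
    """Return [x0, sigma(x0), sigma(sigma(x0)), ...] with count applications of sigma."""
    xs = [x0]
    for _ in range(count):
        xs.append(sigma(xs[-1]))
    return xs


def sum_of_iterates(x, terms=200):
    """Partial sum x + sigma(x) + sigma(sigma(x)) + ... over the first `terms` iterates."""
    total, y = 0.0, x
    for _ in range(terms):
        total += y
        y = sigma(y)
    return total


fixed_point = (math.sqrt(5) - 1) / 2       # reciprocal of the golden ratio
xs = iterates(0.6, 10)                     # the starting point used in Figure IV.17
near = sum_of_iterates(fixed_point - 1e-9)
far = sum_of_iterates(fixed_point - 1e-3)
```

Moving the starting point closer to the fixed point keeps the iterates near it for more steps, so the sum grows without bound, in line with the lower bound used in the proof.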

The dominant positive real singularity of E(z) is thus ρ = ϕ^{−1}, and the Exponential
Growth Formula gives the following estimate.

Proposition IV.6. The number of balanced 2-3 trees satisfies:

(59)    $[z^{n}]\,E(z) \bowtie \left(\dfrac{1 + \sqrt{5}}{2}\right)^{n}.$


It is notable that this estimate could be established so simply by a purely qualitative
examination of the basic functional equation and of a fixed point of the associated iteration scheme.

The complete asymptotic analysis of the E_n requires the full power of singularity analysis
methods to be developed in Chapter VI. Equation (60) below states the end result, which
involves fluctuations that are clearly visible on Figure IV.18 (right). There is overconvergence of
the representation (58), that is, convergence in certain domains beyond the disc of convergence
of E(z). Figure IV.18 (left) displays the domain of analyticity of E(z) and reveals its fractal
nature (compare with Figure VII.23, p. 536). □


▷ IV.48. Quadratic convergence. First, for x ∈ [0, 1/2], one has σ(x) ≤ (3/2)x², so that
σ^{[j]}(x) ≤ (3/2)^{2^j − 1} x^{2^j}. Second, for x ∈ [0, A], where A is any number < ρ, there is a
number k_A such that σ^{[k_A]}(x) < 1/2, so that σ^{[k]}(x) ≤ (3/2)(3/4)^{2^{k−k_A}}. Thus, for any
A < ρ, the series of iterates of σ is quadratically convergent when z ∈ [0, A]. ◁

▷ IV.49. The asymptotic number of 2-3 trees. This analysis is from [459, 461]. The number of
2-3 trees satisfies asymptotically

(60)    $E_n = \dfrac{\varphi^{n}}{n}\,\Omega(\log n) + O\!\left(\dfrac{\varphi^{n}}{n^{2}}\right),$

where Ω is a periodic function with mean value (ϕ log(4 − ϕ))^{−1} ≐ 0.71208 and period
log(4 − ϕ) ≐ 0.86792. Thus oscillations are inherent in E_n; see Figure IV.18 (right). ◁


IV.7.3. Complete asymptotics of a functional equation. George Pólya (1887-1985)
is mostly remembered by combinatorialists for being at the origin of Pólya
theory, a branch of combinatorics that deals with the enumeration of objects invariant
under symmetry groups. However, in his classic article [488, 491] which founded
this theory, Pólya discovered at the same time a number of startling applications of
complex analysis to asymptotic enumeration¹². We detail one of these now.


Example IV.13. Pólya’s alcohols. The combinatorial problem of interest here is the
determination of the number M_n of chemical isomers of alcohols C_nH_{2n+1}OH without asymmetric
carbon atoms. The OGF M(z) = Σ_n M_n z^n that starts as (EIS A000621)

(61)    $M(z) = 1 + z + z^2 + 2z^3 + 3z^4 + 5z^5 + 8z^6 + 14z^7 + 23z^8 + 39z^9 + \cdots,$

is accessible through a functional equation,

(62)    $M(z) = \dfrac{1}{1 - z\,M(z^2)},$

which we adopt as our starting point. Iteration of the functional equation leads to a continued
fraction representation,

$M(z) = \cfrac{1}{1 - \cfrac{z}{1 - \cfrac{z^2}{1 - \cfrac{z^4}{\ddots}}}},$

from which Pólya found:
Proposition IV.7. Let M(z) be the solution analytic around 0 of the functional equation

$M(z) = \dfrac{1}{1 - z\,M(z^2)}.$

Then, there exist constants K, β, and B > 1, such that

$M_n = K \cdot \beta^{n}\left(1 + O(B^{-n})\right), \qquad \beta \doteq 1.68136\,75244, \quad K \doteq 0.36071\,40971.$

¹² In many ways, Pólya can be regarded as the grandfather of the field of analytic combinatorics.

We offer two proofs. The first one is based on direct consideration of the functional
equation and is of a fair degree of applicability. The second one, following Pólya, makes explicit a
special linear structure present in the problem. As suggested by the main estimate, the dominant
singularity of M(z) is a simple pole.
First proof. By positivity of the functional equation, M(z) dominates coefficientwise any
GF (1 − zM^{<m}(z²))^{−1}, where M^{<m}(z) := Σ_{0≤j<m} M_j z^j is the mth truncation of M(z). In
particular, one has the domination relation (use M^{<2}(z) = 1 + z)

$M(z) \succeq \dfrac{1}{1 - z - z^{3}}.$

Since the rational fraction has its dominant pole at z ≐ 0.68232, this implies that the radius ρ
of convergence of M(z) satisfies ρ < 0.69. In the other direction, since M(z²) < M(z)
for z ∈ (0, ρ), then, one has the numerical inequality

$M(z) \le \dfrac{1}{1 - z\,M(z)}, \qquad 0 \le z < \rho.$

This can be used to show (Note IV.50) that the Catalan generating function C(z) = (1 −
√(1 − 4z))/(2z) is a majorant of M(z) on the interval (0, 1/4), which implies that M(z) is well
defined and analytic for z ∈ (0, 1/4). In other words, one has 1/4 ≤ ρ < 0.69. Altogether, the
radius of convergence of M lies strictly between 0 and 1.


▷ IV.50. Alcohols, trees, and bootstrapping. Since M(z) starts as 1 + z + z² + ··· while
C(z) starts as 1 + z + 2z² + ···, there is a small interval (0, ε) such that M(z) ≤ C(z). By
the functional equation of M(z), one has M(z) ≤ C(z) for z in the larger interval (0, √ε).
Bootstrapping then shows that M(z) ≤ C(z) for z ∈ (0, 1/4). ◁


Next, as z → ρ⁻, one must have zM(z²) → 1. (Indeed, if this was not the case, we would
have zM(z²) < A < 1 for some A. But then, since ρ² < ρ, the quantity (1 − zM(z²))^{−1} would
be analytic at z = ρ, a clear contradiction.) Thus, ρ is determined implicitly by the equation

$\rho\,M(\rho^{2}) = 1, \qquad 0 < \rho < 1.$

One can then estimate ρ numerically (Note IV.51), and the stated value of β = 1/ρ follows.
(Pólya determined ρ to five decimals by hand!)
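The implicit equation for the radius of convergence can be solved numerically from a truncation of the series. The sketch below is our own illustration (function names and the truncation order are assumptions): the coefficients come from the recurrence M_n = Σ_j M_j M_{n−1−2j}, obtained by extracting [z^n] in M(z) = 1 + zM(z²)M(z), and the root of z·M(z²) = 1 is then located by bisection on the a priori interval [1/4, 0.69].

```python
def alcohol_counts(n_max):
    """M[0..n_max] from M(z) = 1 + z M(z^2) M(z), i.e. M_n = sum_j M_j M_{n-1-2j}."""
    M = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        M[n] = sum(M[j] * M[n - 1 - 2 * j] for j in range((n - 1) // 2 + 1))
    return M


def radius(n_terms=60, tol=1e-14):
    """Solve z * M_trunc(z^2) = 1 on [1/4, 0.69] by bisection (left side is increasing)."""
    M = alcohol_counts(n_terms)

    def f(z):
        return z * sum(Mn * z ** (2 * n) for n, Mn in enumerate(M)) - 1.0

    lo, hi = 0.25, 0.69
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


first_terms = alcohol_counts(9)
rho = radius()
beta = 1.0 / rho
```

The first coefficients reproduce the expansion (61), and the computed growth constant can be compared with the value stated in Proposition IV.7.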
The previous discussion also implies that ρ is a pole of M(z), which must be simple (since
∂_z(zM(z²)) > 0 at z = ρ). Thus

(63)    $M(z) \underset{z \to \rho}{\sim} K\,\dfrac{1}{1 - z/\rho}, \qquad K := \dfrac{1}{\rho\,M(\rho^{2}) + 2\rho^{3} M'(\rho^{2})}.$

The argument shows at the same time that M(z) is meromorphic in |z| < √ρ ≐ 0.77. That
ρ is the only pole of M(z) on |z| = ρ results from the fact that zM(z²) = z + z³ + ··· can
be subjected to the type of argument encountered in the context of the Daffodil Lemma (see
the discussion of quasi-inverses in the proof of Proposition IV.3, p. 267). The translation of the
singular expansion (63) then yields the statement.

▷ IV.51. The growth constant of molecules. The quantity ρ can be obtained as the limit of
the ρ_m satisfying $\sum_{n=0}^{m} M_n \rho_m^{2n+1} = 1$, together with ρ ∈ [1/4, 0.69]. In each case, only a
few of the M_n (provided by the functional equation) are needed. One obtains: ρ_10 ≐ 0.595,
ρ_20 ≐ 0.594756, ρ_30 ≐ 0.59475397, ρ_40 ≐ 0.594753964. This algorithm constitutes a
geometrically convergent scheme with limit ρ ≐ 0.59475 39639. ◁

Second proof. First, a sequence of formal approximants follows from (62) starting with

$1, \qquad \dfrac{1}{1 - z}, \qquad \cfrac{1}{1 - \cfrac{z}{1 - z^{2}}} = \dfrac{1 - z^{2}}{1 - z - z^{2}}, \qquad \cfrac{1}{1 - \cfrac{z}{1 - \cfrac{z^{2}}{1 - z^{4}}}} = \dfrac{1 - z^{2} - z^{4}}{1 - z - z^{2} - z^{4} + z^{5}},$

which permits us to compute any number of terms of the series M(z). Closer examination
of (62) suggests to set

$M(z) = \dfrac{\psi(z^{2})}{\psi(z)},$

where ψ(z) = 1 − z − z² − z⁴ + z⁵ − z⁸ + z⁹ + z¹⁰ − z¹⁶ + ···. Back substitution into (62)
yields

$\dfrac{\psi(z^{2})}{\psi(z)} = \dfrac{1}{1 - z\,\dfrac{\psi(z^{4})}{\psi(z^{2})}} \qquad\text{or}\qquad \dfrac{\psi(z^{2})}{\psi(z)} = \dfrac{\psi(z^{2})}{\psi(z^{2}) - z\,\psi(z^{4})},$

which shows ψ(z) to be a solution of the functional equation

$\psi(z) = \psi(z^{2}) - z\,\psi(z^{4}), \qquad \psi(0) = 1.$

The coefficients of ψ satisfy the recurrence

$\psi_{4n} = \psi_{2n}, \qquad \psi_{4n+1} = -\psi_{n}, \qquad \psi_{4n+2} = \psi_{2n+1}, \qquad \psi_{4n+3} = 0,$

which implies that their values are all contained in the set {0, −1, +1}.

Thus, M(z) appears to be the quotient of two functions, ψ(z²)/ψ(z), each analytic in the
unit disc, and M(z) is meromorphic in the unit disc. A numerical evaluation then shows that
ψ(z) has its smallest positive real zero at ρ ≐ 0.59475, which is a simple root. The quantity ρ
is thus a pole of M(z) (since, numerically, ψ(ρ²) ≠ 0). Thus

$M(z) \sim \dfrac{\psi(\rho^{2})}{(z - \rho)\,\psi'(\rho)} \quad\Longrightarrow\quad M_n \sim -\dfrac{\psi(\rho^{2})}{\rho\,\psi'(\rho)} \left(\dfrac{1}{\rho}\right)^{n}.$

Numerical computations then yield Pólya’s estimate. Et voilà! □
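The linear structure exploited in this second proof can also be checked mechanically. The sketch below is our own illustration with hypothetical function names: it generates the coefficients of ψ from the recurrence above, verifies on initial coefficients that ψ(z)·M(z) = ψ(z²) as formal power series (with M computed independently from its functional equation), and locates the zero of the truncated ψ in [0.5, 0.69] by bisection.

```python
def psi_coefficients(n_max):
    """Coefficients of psi from psi(z) = psi(z^2) - z psi(z^4), psi(0) = 1."""
    psi = [0] * (n_max + 1)
    psi[0] = 1
    for n in range(1, n_max + 1):
        if n % 2 == 0:
            psi[n] = psi[n // 2]           # covers psi_{4m} and psi_{4m+2}
        elif n % 4 == 1:
            psi[n] = -psi[(n - 1) // 4]    # psi_{4m+1} = -psi_m
        else:
            psi[n] = 0                     # psi_{4m+3} = 0
    return psi


def alcohol_counts(n_max):
    """M[0..n_max] from M(z) = 1 + z M(z^2) M(z)."""
    M = [1] + [0] * n_max
    for n in range(1, n_max + 1):
        M[n] = sum(M[j] * M[n - 1 - 2 * j] for j in range((n - 1) // 2 + 1))
    return M


def quotient_identity_holds(n_max=30):
    """Check [z^n] psi(z) M(z) == [z^n] psi(z^2) for all n <= n_max."""
    psi, M = psi_coefficients(n_max), alcohol_counts(n_max)
    for n in range(n_max + 1):
        lhs = sum(psi[i] * M[n - i] for i in range(n + 1))
        rhs = psi[n // 2] if n % 2 == 0 else 0
        if lhs != rhs:
            return False
    return True


def psi_zero(n_terms=128, tol=1e-14):
    """Zero of the truncated psi in [0.5, 0.69], where it changes sign from + to -."""
    psi = psi_coefficients(n_terms)

    def value(z):
        return sum(c * z**n for n, c in enumerate(psi))

    lo, hi = 0.5, 0.69
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if value(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

The zero found this way can be compared with the value of the radius of convergence obtained in the first proof from the equation satisfied by M.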


The example of Pélya’s alcohols is exemplary, both from a historical point of 
view and from a methodological perspective. As the first proof of Proposition IV.7 
demonstrates, quite a lot of information can be pulled out of a functional equation 
without solving it. (A similar situation will be encountered in relation to coin fountains,
Example V.9, p. 330.) Here, we have made great use of the fact that if f(z) is
analytic in |z| < r and some a priori bounds imply the strict inequalities 0 < r < 1,
then one can regard functions like $f(z^2)$, $f(z^3)$, and so on, as “known” since they are
analytic in the disc of convergence of f and even beyond, a situation also evocative of
our earlier discussion of Pólya operators in Section IV.4, p. 249. Globally, the lesson
is that functional equations, even complicated ones, can be used to bootstrap the local 
singular behaviour of solutions, and one can often do so even in the absence of any 
explicit generating function solution. The transition from singularities to coefficient 
asymptotics is then a simple jump. 
▷ IV.52. An arithmetic exercise. The coefficients $\psi_n = [z^n]\psi(z)$ can be characterized simply
in terms of the binary representation of n. Find the asymptotic proportion of the $\psi_n$ for $n \in [1\,..\,N]$
that assume each of the values 0, +1, and −1. ◁
IV.8. Perspective


In this chapter, we have started examining generating functions under a new light. 
Instead of being merely formal algebraic objects—power series—that encode ex- 
actly counting sequences, generating functions can be regarded as analytic objects— 
transformations of the complex plane—whose singularities provide a wealth of infor- 
mation concerning asymptotic properties of structures. 

Singularities provide a royal road to coefficient asymptotics. We could treat here, 
with a relatively simple apparatus, singularities that are poles. In this perspective, the 
two main statements of this chapter are the theorems relative to the expansion of rational
and meromorphic functions (Theorems IV.9, p. 256, and IV.10, p. 258). These
are classical results of analysis. Issai Schur (1875–1941) is to be counted among the
very first mathematicians who recognized their rôle in combinatorial enumerations
(denumerants, Example IV.6, p. 257). The complex analytic thread was developed
much further by George Pólya in his famous paper of 1937 (see [488, 491]), which
Read in [491, p. 96] describes as a “landmark in the history of combinatorial analysis”.
There, Pólya laid the groundwork of combinatorial chemistry, the enumeration
of objects under group actions, and, last but not least, the complex asymptotic theory 
of graphs and trees. Thanks to complex analytic methods, many combinatorial classes 
amenable to symbolic descriptions can be thoroughly analysed, with regard to their 
asymptotic properties, by means of a selected collection of basic theorems of complex 
analysis. The case of structures such as balanced trees and molecules, where only a 
functional equation of sorts is available, is exemplary. 

The present chapter then serves as the foundation stone of a rich theory to be de- 
veloped in future chapters. Chapter V will elaborate on the analysis of rational and 
meromorphic functions, and present a coherent theory of paths in graphs, automata, 
and transfer matrices in the perspective of analytic combinatorics. Next, the method 
of singularity analysis developed in Chapter VI considerably extends the range of ap- 
plicability of the Second Principle to functions having singularities appreciably more 
complicated than poles (e.g., those involving fractional powers, logarithms, iterated
logarithms, and so on). Applications will be given to recursive structures, including
many types of trees, in Chapter VII. Chapter VIII, dedicated to saddle-point methods,
will then complete the picture of univariate asymptotics by providing a unified treat- 
ment of counting GFs that are either entire functions (hence, have no singularity at a 
finite distance) or manifest a violent growth at their singularities (hence, fall outside 
of the scope of meromorphic or singularity-analysis asymptotics). Finally, in Chap- 
ter IX, the corresponding perturbative methods will be put to use in order to distil limit 
laws for parameters of combinatorial structures. 


Bibliographic notes. This chapter has been designed to serve as a refresher of basic com- 
plex analysis, with special emphasis on methods relevant for analytic combinatorics. See Fig- 
ure IV.19 for a concise summary of results. References most useful for the discussion given 
here include the books of Titchmarsh [577] (oriented towards classical analysis), Whittaker and 
Watson [604] (stressing special functions), Dieudonné [165], Hille [334], and Knopp [373]. 
Henrici [329] presents complex analysis under the perspective of constructive and numerical 
methods, a highly valuable point of view for this book. 
Basics. The theory of analytic functions benefits from the equivalence between two notions,
analyticity and differentiability. It is the basis of a powerful integral calculus, much
different from its real variable counterpart. The following two results can serve as “axioms” of 
the theory. 


THEOREM IV.1 [Basic Equivalence Theorem] (p. 232): Two fundamental notions are equiv- 
alent, namely, analyticity (defined by convergent power series) and holomorphy (defined by 
differentiability). Combinatorial generating functions, a priori determined by their expansions 
at 0 thus satisfy the rich set of properties associated with these two equivalent notions. 
THEOREM IV.2 [Null Integral Property] (p. 234): The integral of an analytic function along a 
simple loop (closed path that can be contracted to a single point) is 0. Consequently, integrals 
are largely independent of particular details of the integration contour. 

Residues. For meromorphic functions (functions with poles), residues are essential. Co- 
efficients of a function can be evaluated by means of integrals. The following two theorems 
provide connections between local properties of a function (e.g., coefficients at one point) and 
global properties of the function elsewhere (e.g., an integral along a distant curve). 


THEOREM IV.3 [Cauchy’s residue theorem] (p. 234): In the realm of meromorphic functions, 
integrals of a function can be evaluated based on local properties of the function at a few specific 
points, its poles. 
THEOREM IV.4 [Cauchy’s Coefficient Formula] (p. 237): This is an almost immediate conse- 
quence of Cauchy’s residue theorem: The coefficients of an analytic function admit of a repre- 
sentation by a contour integral. Coefficients can then be evaluated or estimated using properties 
of the function at points away from the origin. 

Singularities and growth. Singularities (places where analyticity stops), provide essential 
information on the growth rate of a function’s coefficients. The “First Principle” relates the 
exponential growth rate of coefficients to the location of singularities. 


THEOREM IV.5 [Boundary singularities] (p. 240): A function (given by its series expansion 
at 0) always has a singularity on the boundary of its disc of convergence. 
THEOREM IV.6 [Pringsheim’s Theorem] (p. 240): This theorem refines the previous one for 
functions with non-negative coefficients. It implies that, in the case of combinatorial generating 
functions, the search for a dominant singularity can be restricted to the positive real axis. 
THEOREM IV.7 [Exponential Growth Formula] (p. 244): The exponential growth rate of coefficients
is dictated by the location of the singularities nearest to the origin, the dominant
singularities.
THEOREM IV.8 [Computability of growth] (p. 251): For any combinatorial class that is non- 
recursive (iterative), the exponential growth rate of coefficients is invariably a computable num- 
ber. This statement can be regarded as the first general theorem of analytic combinatorics. 
Coefficient asymptotics. The “Second Principle” relates subexponential factors of coef- 
ficients to the nature of singularities. For rational and meromorphic functions, everything is 
simple. 
THEOREM IV.9 [Expansion of rational functions] (p. 256): Coefficients of rational functions 
are explicitly expressible in terms of the poles, given their location (values) and nature (multi- 
plicity). 
THEOREM IV.10 [Expansion of meromorphic functions] (p. 258): Coefficients of meromorphic
functions admit of a precise asymptotic form with exponentially small error terms, given the
location and nature of the dominant poles.


Figure IV.19. A summary of the main results of Chapter IV. 
De Bruijn’s classic booklet [143] is a wonderfully concrete introduction to effective asymptotic
theory, and it contains many examples from discrete mathematics thoroughly worked out
using a complex analytic approach. The use of such analytic methods in combinatorics was pi- 
oneered in modern times by Bender and Odlyzko, whose first publications in this area go back 
to the 1970s. The state of affairs in 1995 regarding analytic methods in combinatorial enumer- 
ation is superbly summarized in Odlyzko’s scholarly chapter [461]. Wilf devotes Chapter 5 of 
his Generatingfunctionology [608] to this question. The books by Hofri [335], Mahmoud [429], 
and Szpankowski [564] contain useful accounts in the perspective of analysis of algorithms. See 
also our book [538] for a light introduction and the chapter by Vitter and Flajolet [598] for more 
on this specific topic. 


Despite all appearances they [generating functions] belong to algebra and not to analysis. 


Combinatorialists use recurrence, generating functions, and such transformations as the 
Vandermonde convolution; others to my horror, use contour integrals, 
differential equations, and other resources of mathematical analysis. 


— JOHN RIORDAN [513, p. viii] and [512, Pref.] 


V 


Applications of Rational and 
Meromorphic Asymptotics 


Analytic methods are extremely powerful and when they apply, 
they often yield estimates of unparalleled precision. 


— ANDREW ODLYZKO [461] 
The primary goal of this chapter is to provide combinatorial illustrations of the power
of complex analytic methods, and specifically of the rational-meromorphic frame- 
work developed in the previous chapter. At the same time, we shift gears and envisage 
counting problems at a new level of generality. Precisely, we organize combinatorial 
problems into wide families of combinatorial types amenable to a common treatment 
and associated with a common collection of asymptotic properties. Without attempt- 
ing a formal definition, we call schema any such family determined by combinatorial 
and analytic conditions that covers an infinity of combinatorial classes. 

First, we discuss a general schema of analytic combinatorics known as the su- 
percritical sequence schema, which provides a neat illustration of the power of mero- 
morphic asymptotics (Theorem IV.10, p. 258), while being of wide applicability. This 
schema unifies the analysis of compositions, surjections, and alignments; it applies to 
any class which is defined as a sequence, provided components satisfy a simple ana- 
lytic condition (“supercriticality”). For instance, one can predict very precisely (and 
easily) the number of ways in which an integer can be decomposed additively as a 
sum of primes (or twin primes), this even though many details of the distribution of 
primes are still shrouded in mystery.

The next schema comprises regular specifications and languages, which a priori 
lead to rational generating functions and are thus systematically amenable to Theo- 
rem IV.9 (p. 256), to the effect that coefficients are described as exponential poly- 
nomials. In the case of regular specifications, much additional structure is present, 
especially positivity. Accordingly, counting sequences are of a simple exponential-polynomial
form and fluctuations can be systematically circumvented. Applications
presented in this chapter include the analysis of longest runs, attached to maximal
sequences of good (or bad) luck in games of chance, pure birth processes, and the
occurrence of hidden patterns (subsequences) in random texts. 

We then consider an important subset of regular specifications, corresponding to 
nested sequences, that combinatorially describe a variety of lattice paths. Such nested 
sequences naturally lead to nested quasi-inverses, which are none other than continued 
fractions. A wealth of combinatorial, algebraic, and analytic properties then surround 
such constructions. A prime illustration is the complete analysis of height in Dyck 
paths and general Catalan trees; other interesting applications relate to coin fountains
and interconnection networks.

Finally, the last two sections examine positive linear systems of generating func- 
tions, starting with the simplest case of finite graphs and automata, and concluding 
with the general framework of transfer matrices. Although the resulting generating 
functions are once more bound to be rational, there is benefit in examining them as 
defined implicitly (rather than solving explicitly) and work out singularities directly. 
The spectrum of matrices (the set of eigenvalues) then plays a central rôle. An important
case is the irreducible linear system schema, which is closely related to the
Perron–Frobenius theory of non-negative matrices, whose importance has been long
recognized in the theory of finite Markov chains. A general discussion of singularities 
can then be conducted, leading to valuable consequences on a variety of models— 
paths in graphs, finite automata, and transfer matrices. The last example discussed 
in this chapter treats locally constrained permutations, where rational functions combined
with inclusion–exclusion provide an entry to the world of value-constrained
permutations. 


In the various combinatorial examples encountered in this chapter, the generating 
functions are meromorphic in some domain extending beyond their disc of conver- 
gence at 0. As a consequence, the asymptotic estimates of coefficients involve main 
terms that are explicit exponential-polynomials and error terms that are exponentially
smaller. This is a situation well summarized by Odlyzko’s aphorism quoted on p. 289:
“Analytic methods [...] often yield estimates of unparalleled precision”.


V.1. A roadmap to rational and meromorphic asymptotics 


The key character in this chapter is the combinatorial sequence construction SEQ. 
Since its translation into generating functions involves a quasi-inverse, $(1 - f)^{-1}$, the
construction should in many cases be expected to induce polar singularities. Also,
linear systems of equations, of which the simplest case is X = 1 + AX, are solvable
by means of inverses: the solution is $X = (1 - A)^{-1}$ in the scalar case, and it is otherwise
expressible as a quotient of determinants (by Cramer’s rule) in the matrix case.
Consequently, linear systems of equations are also conducive to polar singularities. 

This chapter accordingly develops along two main lines. First, we study non- 
recursive families of combinatorial problems that are, in a suitable sense, driven by a 
sequence construction (Sections V.2–V.4). Second, we examine families of recursive
problems that are naturally described by linear systems of equations (Sections V.5–V.6).
Clearly, the general theorems giving the asymptotic forms of coefficients of
rational and meromorphic functions apply. As we shall see, the additional positivity
structure arising from combinatorics entails notable simplifications in the asymptotic
form of counting sequences. 


The supercritical sequence schema. This schema, fully described in Section V.2
(p. 293), corresponds to the general form $\mathcal{F} = \operatorname{SEQ}(\mathcal{G})$, together with a simple
analytic condition, “supercriticality”, attached to the generating function G(z) of $\mathcal{G}$.
Under this condition, the sequence $(F_n)$ happens to be predictable and an asymptotic
estimate,


$$F_n = c\beta^n + O(B^n), \qquad 0 \le B < \beta, \quad c \in \mathbb{R}_{>0}, \tag{1}$$


applies with $\beta$ such that $G(1/\beta) = 1$. Integer compositions, surjections, and alignments
presented in Chapters I and II can then be treated in a unified manner. The
supercritical sequence schema even covers situations where G is not necessarily con- 
structible: this includes compositions into summands that are prime numbers or twin 
primes. Parameters, like the number of components and more generally profiles, are 
under these circumstances governed by laws that hold with a high probability. 
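To make the condition $G(1/\beta) = 1$ concrete, here is a minimal sketch for compositions into prime summands. It rests on assumptions of this illustration only: $G(z) = \sum_p z^p$ is truncated to primes below 300 (the neglected tail is far below double precision at the root), the root is located by bisection on $(0, 1)$, and the function names are hypothetical.

```python
def primes_up_to(limit):
    """Sieve of Eratosthenes; returns all primes <= limit (limit >= 2)."""
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, limit + 1, p):
                sieve[q] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]


def growth_rate(parts, steps=60):
    """beta = 1/sigma where G(sigma) = 1 and G(z) = sum of z^p over the allowed parts.

    G is increasing on (0, 1) with G(0) = 0 and G(1) = len(parts) > 1,
    so plain bisection applies.
    """
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if sum(mid ** p for p in parts) < 1:
            lo = mid
        else:
            hi = mid
    return 2 / (lo + hi)


def count_compositions(parts, n_max):
    """f_n = number of compositions of n with all summands in `parts`."""
    f = [0] * (n_max + 1)
    f[0] = 1
    for n in range(1, n_max + 1):
        f[n] = sum(f[n - p] for p in parts if p <= n)
    return f


primes = primes_up_to(300)
beta = growth_rate(primes)
counts = count_compositions(primes, 201)
print(beta, counts[:11], counts[201] / counts[200])
```

The ratio of consecutive exact counts should approach $\beta$, which is what estimate (1) predicts.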


Regular specification and languages. This topic is treated in Section V.3 (p. 300).
Regular specifications are non-recursive specifications that only involve the constructions
$(+, \times, \operatorname{SEQ})$. In the unlabelled case, they can always be interpreted as describing
a regular language in the sense of Section I.4, p. 49. The main result here is the following:
given a regular specification $\mathcal{R}$, it is possible to determine constructively a
number D, so that an asymptotic estimate of the form


$$R_n = P(n)\beta^n + O(B^n), \qquad 0 \le B < \beta, \quad P \text{ a polynomial}, \tag{2}$$


holds, once the index n is restricted to a fixed congruence class modulo D. (Naturally,
the quantities $P$, $\beta$, $B$ may depend on the particular congruence class considered.) In
other words, a “pure” exponential polynomial form holds for each of the D “sections”
[subsequences defined on p. 302] of the counting sequence $(R_n)_{n \ge 0}$. In particular, irregular
fluctuations, which might otherwise arise from the existence of several dominant
poles sharing the same modulus but having incommensurable arguments (see the
discussion in Subsection IV.6.1, p. 263, dedicated to multiple singularities), are simply
not present in regular specifications and languages. Similar estimates hold for profiles 
of regular specifications, where the profile of an object is understood to be the number 
of times any fixed construction is employed. 
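
As a small worked illustration of the rôle of the modulus D (an example chosen for this note, not taken from the text), consider the regular specification $\operatorname{SEQ}(\mathcal{Z}) \times \operatorname{SEQ}(\mathcal{Z}\mathcal{Z} + \mathcal{Z}\mathcal{Z})$. Its OGF has two dominant poles of equal modulus, at $z = \pm 1/\sqrt{2}$, and a subdominant pole at $z = 1$:

$$R(z) = \frac{1}{(1-z)(1-2z^2)}, \qquad R_n = \sum_{0 \le 2m \le n} 2^m = 2^{\lfloor n/2 \rfloor + 1} - 1,$$

$$R_n = 2\,(\sqrt{2})^n - 1 \quad (n \text{ even}), \qquad R_n = \sqrt{2}\,(\sqrt{2})^n - 1 \quad (n \text{ odd}).$$

In the notation of (2), one may take $D = 2$, $\beta = \sqrt{2}$ and $B = 1$, with the constant polynomial $P$ equal to $2$ on the even class and to $\sqrt{2}$ on the odd class. The sequence as a whole is not of the form $c\beta^n$, but each of its two sections is.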


Nested sequences, lattice paths, and continued fractions. The material considered
in Section V.4 (p. 318) could be termed the $\operatorname{SEQ} \circ \cdots \circ \operatorname{SEQ}$ schema, corresponding
to nested sequences. The associated GFs are chains of quasi-inverses; that is,
continued fractions. Although the general theory of regular specifications applies, the
additional structure resulting from nested sequences implies, in essence, uniqueness
and simplicity of the dominant pole, resulting directly in an estimate of the form


$$S_n = c\beta^n + O(B^n), \qquad 0 \le B < \beta, \quad c \in \mathbb{R}_{>0}, \tag{3}$$

for objects enumerated by nested sequences. This schema covers lattice paths of
bounded height, their weighted versions, as well as several other bijectively equivalent
classes, like interconnection networks. In each case, profiles can be fully characterized,
the estimates being of a simple form.
Paths in graphs and automata. The framework of paths in directed graphs expounded
in Section V.5 (p. 336) is of considerable generality. In particular, it covers
the case of finite automata introduced in Subsection I.4.2, p. 56. Although, in the
abstract, the descriptive power of this framework is formally equivalent to the one of
regular specifications (Appendix A.7: Regular languages, p. 733), there is great advantage
in considering directly problems whose natural formulation is recursive and
phrased in terms of graphs or automata. (The reduction of automata to regular expressions
is non-trivial so that it does not tend to preserve the original combinatorial
structure.) The algebraic theory is that of matrices of the form $(I - zT)^{-1}$, where T
is a matrix with non-negative entries. The analytic theory behind the scene is now that
of positive matrices and the companion Perron–Frobenius theory. Uniqueness and
simplicity of dominant poles of generating functions can be guaranteed under easily
testable structural conditions, principally the condition of irreducibility, which corresponds
to a strong connectedness of the system. Then a pure exponential polynomial
form holds,


$$C_n \sim c\lambda_1^n + O(\Lambda^n), \qquad 0 \le \Lambda < \lambda_1, \quad c \in \mathbb{R}_{>0}, \tag{4}$$


where $\lambda_1$ is the (unique) dominant eigenvalue of the transition matrix T. Applications
include walks over various types of graphs (the interval graph, the devil’s staircase) 
and words excluding one or several patterns (walks on the De Bruijn graph). 
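
A minimal sketch of estimate (4) on the smallest non-trivial automaton may help fix ideas. The example is an assumption of this note rather than one drawn from the text: binary words over {a, b} that avoid the factor aa, recognized by a two-state automaton (state 0: the last letter is not a; state 1: the last letter is a). The dominant eigenvalue is approximated by plain power iteration, and all names are hypothetical.

```python
import math


def count_walks(T, start, n):
    """Number of walks with n steps starting at `start`, ending anywhere."""
    size = len(T)
    vec = [0] * size
    vec[start] = 1
    for _ in range(n):
        vec = [sum(vec[i] * T[i][j] for i in range(size)) for j in range(size)]
    return sum(vec)


def dominant_eigenvalue(T, iterations=200):
    """Power iteration with max-norm scaling; assumes T is irreducible and aperiodic."""
    size = len(T)
    v = [1.0] * size
    lam = 0.0
    for _ in range(iterations):
        w = [sum(T[i][j] * v[j] for j in range(size)) for i in range(size)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam


# Transitions: from state 0 read b (stay in 0) or a (go to 1); from state 1 only b (go to 0).
T = [[1, 1],
     [1, 0]]

word_counts = [count_walks(T, 0, n) for n in range(41)]
lam1 = dominant_eigenvalue(T)
golden = (1 + math.sqrt(5)) / 2
print(word_counts[:6], lam1, word_counts[40] / lam1 ** 40)
```

The counts are Fibonacci numbers, $\lambda_1$ is the golden ratio, and the last printed value approximates the constant c of (4), here $\varphi^2/\sqrt{5}$.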


Transfer matrices. This framework, whose origins lie in statistical physics, is an 
extension of automata and paths in graphs. What is retained is the notion of a finite 
state system, but transitions can now take place at different speeds. Algebraically, one 
is dealing with matrices of the form $(I - T(z))^{-1}$, where T is a matrix whose entries
are polynomials (in z) with non-negative coefficients. Perron–Frobenius theory can
be adapted to cover such cases, which, to a probabilist, look like a mixture of Markov
chain and renewal theory. The consequence, for this category of models, is once more
an estimate of the type (4), under irreducibility conditions; namely


$$D_n \sim c\mu_1^n + O(M^n), \qquad 0 \le M < \mu_1, \quad c \in \mathbb{R}_{>0}, \tag{5}$$


where $\mu_1 = 1/\sigma$ and $\sigma$ is the smallest positive value of z such that T(z) has dominant
eigenvalue 1. A striking application of transfer matrices is a study, with an experimental
mathematics flavour, of self-avoiding walks and polygons in the plane: it turns
out to be possible to predict, with a high degree of confidence (but no mathematical
certainty, yet), what the number of polygons is and which distribution of area is
to be expected. A combination of the transfer matrix approach with a suitable use
of inclusion–exclusion (Subsection V.6.4, p. 367) finally provides a solution to the
classic ménage problem of combinatorial theory as well as to many related questions 
regarding value-constrained permutations. 


Browsing notes. We, authors, recommend that our gentle reader first gets a bird’s 
eye view of this chapter, by skimming through sections, before descending to ground 
level and studying examples in detail—some of the latter are indeed somewhat tech- 
nically advanced (e.g., they make use of Mellin transforms and/or develop limit laws). 
The contents of this chapter are not needed for Chapters VI-VIII, so that the reader 
who is impatient to penetrate further the logic of analytic combinatorics can at any 
time have a peek at Chapters VI–VIII. We shall see in Chapter IX (specifically,
Section IX.6, p. 650) that all the schemas considered here are, under simple non- 
degeneracy conditions, associated to Gaussian limit laws. 

Sections V.2 to V.6 are organized following a common pattern: first, we discuss
“combinatorial aspects”, then “analytic aspects”, and finally “applications”. Each of 
Sections V.2 to V.5 is furthermore centred around two analytic-combinatorial theo- 
rems, one describing asymptotic enumeration, the other quantifying the asymptotic 
profiles of combinatorial structures. We examine in this way the supercritical se- 
quence schema (Section V.2), general regular specifications (Section V.3), nested 
sequences (Section V.4), and path-in-graphs models (Section V.5). The last section
(Section V.6) departs slightly from this general pattern, since transfer matrices are 
reducible rather simply to the framework of paths in graphs and automata, so that we 
do not need specifically new statements. 


V.2. The supercritical sequence schema 


This schema is combinatorially the simplest treated in this chapter, since it plainly 
deals with the sequence construction. An auxiliary analytic condition, named “super- 
criticality” ensures that meromorphic asymptotics applies and entails strong statistical 
regularities. The paradigm of supercritical sequences unifies the asymptotic properties 
of a number of seemingly different combinatorial types, including integer composi- 
tions, surjections, and alignments. 


V.2.1. Combinatorial aspects. We consider a sequence construction, which may 
be taken in either the unlabelled or the labelled universe. In either case, we have 


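$$\mathcal{F} = \operatorname{SEQ}(\mathcal{G}) \quad\Longrightarrow\quad F(z) = \frac{1}{1 - G(z)},$$


with G(0) = 0. It will prove convenient to set

$$f_n = [z^n]F(z), \qquad g_n = [z^n]G(z),$$


so that the number of $\mathcal{F}_n$ structures is $f_n$ in the unlabelled case and $n!\,f_n$ otherwise.
From Chapter III, the BGF of $\mathcal{F}$-structures with u marking the number of $\mathcal{G}$-components
is


$$\mathcal{F} = \operatorname{SEQ}(u\,\mathcal{G}) \quad\Longrightarrow\quad F(z, u) = \frac{1}{1 - uG(z)}. \tag{6}$$

We also have access to the BGF of $\mathcal{F}$ with u marking the number of $\mathcal{G}_k$-components:


$$\mathcal{F} = \operatorname{SEQ}\bigl(u\,\mathcal{G}_k + (\mathcal{G} \setminus \mathcal{G}_k)\bigr) \quad\Longrightarrow\quad F(z, u) = \frac{1}{1 - \bigl(G(z) + (u - 1)\,g_k z^k\bigr)}. \tag{7}$$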


V.2.2. Analytic aspects. We restrict attention to the case where the radius of
convergence $\rho$ of G(z) is non-zero, in which case, the radius of convergence of F(z)
is also non-zero by virtue of closure properties of analytic functions. Here is the basic
concept of this section.

Definition V.1. Let F, G be generating functions with non-negative coefficients that
are analytic at 0, with G(0) = 0. The analytic relation $F(z) = (1 - G(z))^{-1}$ is
said to be supercritical if $G(\rho) > 1$, where $\rho = \rho_G$ is the radius of convergence
of G. A combinatorial schema $\mathcal{F} = \operatorname{SEQ}(\mathcal{G})$ is said to be supercritical if the relation
$F(z) = (1 - G(z))^{-1}$ between the corresponding generating functions is supercritical.


Note that $G(\rho)$ is well defined in $\mathbb{R} \cup \{+\infty\}$ as the limit $\lim_{x \to \rho^-} G(x)$ since G(x)
increases along the positive real axis, for $x \in (0, \rho)$. (The value $G(\rho)$ corresponds
to what has been denoted earlier by $\tau_G$ when discussing “signatures” in Section IV.4,
p. 249.) From now on we assume that G(z) is strongly aperiodic in the sense that there
does not exist an integer $d \ge 2$ such that $G(z) = h(z^d)$ for some h analytic at 0. (Put
otherwise, the span of 1 + G(z), as defined on p. 266, is equal to 1.) This condition
entails no loss of analytic generality.


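Theorem V.1 (Asymptotics of supercritical sequence). Let the schema $\mathcal{F} = \operatorname{SEQ}(\mathcal{G})$
be supercritical and assume that G(z) is strongly aperiodic. Then, one has


$$[z^n]F(z) = \frac{1}{\sigma G'(\sigma)} \cdot \sigma^{-n} \left(1 + O(A^n)\right),$$


where $\sigma$ is the root in $(0, \rho_G)$ of $G(\sigma) = 1$ and A is a number less than 1. The
number X of $\mathcal{G}$-components in a random $\mathcal{F}$-structure of size n has mean and variance
satisfying


$$\mathbb{E}_n(X) = \frac{1}{\sigma G'(\sigma)} \cdot (n + 1) - 1 + \frac{G''(\sigma)}{G'(\sigma)^2} + O(A^n),$$

$$\mathbb{V}_n(X) = \frac{\sigma G''(\sigma) + G'(\sigma) - \sigma G'(\sigma)^2}{\sigma^2 G'(\sigma)^3} \cdot n + O(1).$$


In particular, the distribution of X on $\mathcal{F}_n$ is concentrated.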


Proof. See also [260, 547]. The basic observation is that G increases continuously
from G(0) = 0 to $G(\rho_G) = \tau_G$ (with $\tau_G > 1$ by assumption) when x increases from
0 to $\rho_G$. Therefore, the positive number $\sigma$, which satisfies $G(\sigma) = 1$, is well defined.
Then, F is analytic at all points of the interval $(0, \sigma)$. The function G, being analytic
at $\sigma$, satisfies, in a neighbourhood of $\sigma$,


$$G(z) = 1 + G'(\sigma)(z - \sigma) + \frac{1}{2!}\,G''(\sigma)(z - \sigma)^2 + \cdots,$$


so that F(z) has a pole at $z = \sigma$; also, this pole is simple since $G'(\sigma) > 0$, by
positivity of the coefficients of G. Thus, we have


$$F(z) \underset{z \to \sigma}{\sim} -\frac{1}{G'(\sigma)(z - \sigma)} \equiv \frac{1}{\sigma G'(\sigma)} \cdot \frac{1}{1 - z/\sigma}.$$

Pringsheim’s theorem (Theorem IV.6, p. 240) then implies that the radius of convergence
of F must coincide with $\sigma$.

It remains to show that F(z) is meromorphic in a disc of some radius $R > \sigma$
with the point $\sigma$ as the only singularity inside the disc. This results from the assumption
that G is strongly aperiodic. In effect, as a consequence of the Daffodil Lemma
(Lemma IV.3, p. 267), one has $G(\sigma e^{i\theta}) \neq 1$, for all $\theta \not\equiv 0 \pmod{2\pi}$. Thus, by
compactness, there exists a closed disc of radius $R > \sigma$ in which F is analytic except
for a unique pole at $\sigma$. We can now apply the main theorem of meromorphic function
asymptotics (Theorem IV.10, p. 258) to deduce the stated formula with $A = \sigma/R$.

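Next, the number of $\mathcal{G}$-components in a random $\mathcal{F}$-structure of size n has BGF
given by (6), and by differentiation, we get


$$\mathbb{E}_n(X) = \frac{1}{f_n}\,[z^n] \left.\frac{\partial}{\partial u}\,\frac{1}{1 - uG(z)}\right|_{u=1} = \frac{1}{f_n}\,[z^n]\,\frac{G(z)}{(1 - G(z))^2}.$$


The problem is now reduced to extracting coefficients in a univariate generating function
with a double pole at $z = \sigma$, and it suffices to expand the GF locally at $\sigma$:


$$\frac{G(z)}{(1 - G(z))^2} \underset{z \to \sigma}{\sim} \frac{1}{G'(\sigma)^2 (z - \sigma)^2} \equiv \frac{1}{\sigma^2 G'(\sigma)^2} \cdot \frac{1}{(1 - z/\sigma)^2}.$$

The variance calculation is similar, with a triple pole being involved.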
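
The two estimates of Theorem V.1 can be checked against exact counts on a small case. The following sketch takes $G(z) = z + z^2$ (compositions into summands 1 and 2), an example chosen for this illustration rather than one treated at this point of the text; then $\sigma = (\sqrt{5} - 1)/2$, $G'(\sigma) = \sqrt{5}$ and $G''(\sigma) = 2$. The helper names are hypothetical, and exact integer series arithmetic is used for the left-hand sides.

```python
import math


def convolve(a, b, n_max):
    """Product of two power series (coefficient lists), truncated after z^n_max."""
    c = [0] * (n_max + 1)
    for i, ai in enumerate(a[: n_max + 1]):
        if ai:
            for j in range(min(len(b), n_max + 1 - i)):
                c[i + j] += ai * b[j]
    return c


def sequence_series(g, n_max):
    """Coefficients of F = 1/(1 - G), from F = 1 + G*F with g[0] = 0."""
    f = [0] * (n_max + 1)
    f[0] = 1
    for n in range(1, n_max + 1):
        f[n] = sum(g[k] * f[n - k] for k in range(1, min(n, len(g) - 1) + 1))
    return f


g = [0, 1, 1]                      # G(z) = z + z^2
n = 40
f = sequence_series(g, n)          # f_n = [z^n] 1/(1 - G)
# Cumulated number of components: [z^n] G/(1 - G)^2 = [z^n] G * F * F.
cumulated = convolve(convolve(g, f, n), f, n)
exact_mean = cumulated[n] / f[n]

sigma = (math.sqrt(5) - 1) / 2     # root of G(sigma) = 1
d1 = 1 + 2 * sigma                 # G'(sigma)
d2 = 2.0                           # G''(sigma)
predicted_count = sigma ** (-n) / (sigma * d1)
predicted_mean = (n + 1) / (sigma * d1) - 1 + d2 / d1 ** 2
print(f[n], predicted_count, exact_mean, predicted_mean)
```

Both comparisons agree to many digits already at n = 40, as expected from the exponentially small error terms $O(A^n)$.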


When a sequence construction is supercritical, the number of components is in
the mean of order n while its standard deviation is $O(\sqrt{n})$. Thus, the distribution is
concentrated (in the sense of Section III.2.2, p. 161). In fact, it results from a
general theorem of Bender [35] that the distribution of the number of components is
asymptotically Gaussian, a property to be established in Section IX.6, p. 650.


Profiles of supercritical sequences. We have seen in Chapter III that integer
compositions and integer partitions, when sampled at random, tend to assume rather
different aspects. Given a sequence construction, $\mathcal{F} = \operatorname{SEQ}(\mathcal{G})$, the profile of an
element $\alpha \in \mathcal{F}$ is the vector $(X^{(1)}, X^{(2)}, \ldots)$ where $X^{(j)}(\alpha)$ is the number of $\mathcal{G}$-components
in $\alpha$ that have size j. In the case of (unrestricted) integer compositions,
it could be proved elementarily (Example III.6, p. 167) that, on average, for size n,
the number of 1-summands is $\sim n/4$, the number of 2-summands is $\sim n/8$, and so
on. Now that meromorphic asymptotics is available, such a property can be placed in
a much wider perspective.

Theorem V.2 (Profiles of supercritical sequences). Consider a supercritical sequence
construction, $\mathcal{F} = \operatorname{SEQ}(\mathcal{G})$, with G(z) strongly aperiodic, as in Theorem V.1. The
number of $\mathcal{G}$-components of any fixed size k in a random $\mathcal{F}$-object of size n satisfies


$$\mathbb{E}_n(X^{(k)}) = \frac{g_k \sigma^k}{\sigma G'(\sigma)}\,n + O(1), \qquad \mathbb{V}_n(X^{(k)}) = O(n), \tag{8}$$


where $\sigma$ in $(0, \rho_G)$ is such that $G(\sigma) = 1$, and $g_k = [z^k]G(z)$.


Proof. The BGF with u marking the number of $\mathcal{G}$-components of size k is given in (7).
The mean value is then obtained as a quotient,


$$\mathbb{E}_n(X^{(k)}) = \frac{1}{f_n}\,[z^n] \left.\frac{\partial}{\partial u} F(z, u)\right|_{u=1} = \frac{1}{f_n}\,[z^n]\,\frac{g_k z^k}{(1 - G(z))^2}.$$

The GF of cumulated values has a double pole at $z = \sigma$, and the estimate of the mean
value follows. The variance is estimated similarly, after two successive differentiations
and the analysis of a triple pole.
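
For unrestricted compositions, where $G(z) = z/(1 - z)$, $g_k = 1$, $\sigma = 1/2$ and $G'(\sigma) = 4$, estimate (8) predicts a mean of $n/2^{k+1} + O(1)$ summands equal to k. The sketch below checks this by brute-force enumeration rather than by series manipulation. The closed form $(n - k + 3)/2^{k+1}$ used in the comparison is derived here by expanding $z^k(1 - z)^2/(1 - 2z)^2$ and is valid for $n - k \ge 2$; the function names are hypothetical.

```python
from fractions import Fraction


def compositions(n):
    """Yield every composition of n >= 1, one per subset of the n - 1 possible cut points."""
    for mask in range(1 << (n - 1)):
        parts, run = [], 1
        for i in range(n - 1):
            if (mask >> i) & 1:
                parts.append(run)   # cut after position i: close the current summand
                run = 1
            else:
                run += 1            # no cut: the current summand grows by one
        parts.append(run)
        yield tuple(parts)


def mean_parts_of_size(n, k):
    """Exact mean number of summands equal to k over all 2^(n-1) compositions of n."""
    total = sum(c.count(k) for c in compositions(n))
    return Fraction(total, 2 ** (n - 1))


n = 12
means = {k: mean_parts_of_size(n, k) for k in range(1, 5)}
leading = {k: Fraction(n, 2 ** (k + 1)) for k in range(1, 5)}   # main term of (8)
for k in range(1, 5):
    print(k, means[k], leading[k])
```

The exact means differ from the leading term of (8) by a bounded amount, in line with the $O(1)$ error.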


The total number of components X satisfies $X = \sum_k X^{(k)}$ and, by Theorem V.1,
its mean is asymptotic to $n/(\sigma G'(\sigma))$. Thus, Equation (8) indicates that, at least
in some average-value sense, the “proportion” of components of size k among all
components is given by $g_k \sigma^k$.
▷ V.1. Proportion of k-components and convergence in probability. For any fixed k, the random
variable $X_n^{(k)}/X_n$ converges in probability to the value $g_k \sigma^k$,


$$\frac{X_n^{(k)}}{X_n} \xrightarrow{\;P\;} g_k \sigma^k, \quad\text{i.e.,}\quad \lim_{n \to \infty} \mathbb{P}\left\{ g_k \sigma^k (1 - \epsilon) \le \frac{X_n^{(k)}}{X_n} \le g_k \sigma^k (1 + \epsilon) \right\} = 1,$$


for any $\epsilon > 0$. The proof is an easy consequence of the Chebyshev inequalities (the distributions
of $X_n$ and $X_n^{(k)}$ are both concentrated). ◁


V.2.3. Applications. We examine here two types of applications of the super- 
critical sequence schema. Example V.1 makes explicit the asymptotic enumeration 
and the analysis of profiles of compositions, surjections and alignments. What stands 
out is the way the mean profile of a structure reflects the underlying inner construction
$\mathfrak{K}$ in schemas of the form $\operatorname{SEQ}(\mathfrak{K}(\mathcal{Z}))$. Example V.2 discusses compositions into
restricted summands, including the striking case of compositions into primes. 


Example V.1. Compositions, surjections, and alignments. The three classes of interest here 
are integer compositions (C), surjections (72) and alignments (), which are specified as 


C = SEQ(SEQs(2)), R = SEQ(SETS|(2Z)), O = SEQ(CYC(Z)) 
and belong to either the labelled universe (C) or to the labelled universe (R and QO). The 
generating functions (of type OGF, EGF, and EGF, respectively) are 
1 


1 
Ce cats BEG T= log(l = 2)! 


1 
— Tz =(= 1)’ 
A direct application of Theorem V.1 (p. 294) gives us back the known results 


1 1 1 
C= gro Rn ay 5 (log2)-"—", On _ e (1 = a mee 


O(z) = 


corresponding to o equal to i, log 2, and 1 — el, respectively. 

Similarly, the expected number of summands in a random composition of the integer n 
is ~ n/2; the expected cardinality of the range of a random surjection whose domain has 
cardinality $n$ is asymptotic to $\beta n$ with $\beta = 1/(2\log 2)$; the expected number of components in
a random alignment of size n is asymptotic to n/(e — 1). 

Theorem V.2 also applies, providing the mean number of components of size k in each 
case. The following table summarizes the conclusions. 


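Structures      Specification                                       Law ($g_k\sigma^k$)              Type          $\sigma$
Compositions    $\mathrm{SEQ}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$     $\frac{1}{2^k}$                  Geometric     $\frac12$
Surjections     $\mathrm{SEQ}(\mathrm{SET}_{\ge 1}(\mathcal{Z}))$     $\frac{1}{k!}(\log 2)^k$         Poisson       $\log 2$
Alignments      $\mathrm{SEQ}(\mathrm{CYC}(\mathcal{Z}))$             $\frac{1}{k}(1 - e^{-1})^k$      Logarithmic   $1 - e^{-1}$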


Note that the stated laws necessitate $k \ge 1$. The geometric and Poisson laws are classical; the
logarithmic distribution (also called “logarithmic-series distribution”) of a parameter $\lambda > 0$ is
by definition the law of a discrete random variable $Y$ such that

$$\mathbb{P}(Y = k) = \frac{1}{\log(1-\lambda)^{-1}}\,\frac{\lambda^k}{k}, \qquad k \ge 1.$$

The way the internal construction $\mathcal{K}$ in the schema $\mathrm{SEQ}(\mathcal{K}(\mathcal{Z}))$ determines the asymptotic
proportion of components of each size,

    Sequence ↦ Geometric;   Set ↦ Poisson;   Cycle ↦ Logarithmic,

stands out. Figure V.1 exemplifies the phenomenon by displaying components sorted by size
and represented by vertical segments of corresponding lengths for three randomly drawn objects
of size $n = 100$. □

Figure V.1. Profile of structures drawn at random represented by the sizes of their
components in sorted order: (from left to right) a random composition, surjection,
and alignment of size n = 100.
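The three limit laws of Example V.1 can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes the values of $\sigma$ and the forms $g_k\sigma^k$ given in the table above, truncates every sum at a hypothetical cut-off `KMAX`, and verifies that each law has total mass 1 and mean $\sigma G'(\sigma)$ (the reciprocal of the constant in the mean number of components). All function names are illustrative.

```python
import math

KMAX = 100  # truncation point (assumption); the neglected tails are far below double precision

# sigma for each schema, as listed in the table of Example V.1
SIGMA = {
    "compositions": 0.5,              # G(z) = z/(1-z)
    "surjections": math.log(2),       # G(z) = exp(z) - 1
    "alignments": 1 - math.exp(-1),   # G(z) = log(1/(1-z))
}

def law(name, k):
    """Value g_k * sigma**k of the limit law for components of size k >= 1."""
    s = SIGMA[name]
    if name == "compositions":
        return s ** k                       # geometric: g_k = 1
    if name == "surjections":
        return s ** k / math.factorial(k)   # Poisson type: g_k = 1/k!
    return s ** k / k                       # logarithmic: g_k = 1/k

def total_mass(name):
    """Sum of the law over k = 1..KMAX; should be close to 1."""
    return sum(law(name, k) for k in range(1, KMAX + 1))

def mean_size(name):
    """Mean of the law over k = 1..KMAX; equals sigma * G'(sigma) up to truncation."""
    return sum(k * law(name, k) for k in range(1, KMAX + 1))

for name in SIGMA:
    print(name, round(total_mass(name), 12), round(mean_size(name), 6))
```

The means come out as 2, $2\log 2$ and $e - 1$, which matches the expected numbers of components $n/2$, $n/(2\log 2)$ and $n/(e-1)$ quoted in the example.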


Example V.2. Compositions with restricted summands, compositions into primes. Unrestricted
integer compositions are well understood as regards enumeration: their number is exactly
$C_n = 2^{n-1}$, their OGF is $C(z) = (1 - z)/(1 - 2z)$, and compositions with $k$ summands
are enumerated by binomial coefficients. Such simple exact formulae disappear when restricted
compositions are considered, but, as we now show, asymptotics is much more robust to changes
in specifications.

Let $S$ be a subset of the integers $\mathbb{Z}_{\ge 1}$ such that $\gcd(S) = 1$, i.e., not all members of $S$ are
multiples of a common divisor $d \ge 2$. In order to avoid trivialities, we also assume that $S$ has at
least two elements. The class $\mathcal{C}^S$ of compositions with summands constrained to the set $S$ then
satisfies:

$$\mathcal{C}^S = \mathrm{SEQ}(\mathrm{SEQ}_S(\mathcal{Z})) \quad\Longrightarrow\quad C^S(z) = \frac{1}{1 - S(z)}, \qquad S(z) = \sum_{s \in S} z^s.$$

By assumption, $S(z)$ is strongly aperiodic, so that Theorem V.1 (p. 294) applies directly. There
is a well-defined number $\sigma$ such that

$$S(\sigma) = 1, \qquad 0 < \sigma < 1,$$

and the number of $S$-restricted compositions satisfies

$$C_n^S := [z^n]C^S(z) = \frac{1}{\sigma S'(\sigma)} \cdot \sigma^{-n}\,(1 + O(A^n)). \tag{9}$$

Among the already discussed cases, $S = \{1, 2\}$ gives rise to Fibonacci numbers $F_n$ and, more
generally, $S = \{1, \ldots, r\}$ corresponds to compositions with summands at most $r$. In this case, the
OGF,

$$C^{\{1,\ldots,r\}}(z) = \frac{1}{1 - z\,\frac{1 - z^r}{1 - z}} = \frac{1 - z}{1 - 2z + z^{r+1}},$$

is a simple variant of the OGF associated to longest runs in strings, which is studied at length
in Example V.4, p. 308. The treatment of the latter can be copied almost verbatim to the effect
that the largest component in a random composition of $n$ is found to be $\log_2 n + O(1)$, both on
average and with high probability.

      n    exact value            asymptotic formula (rounded)
     10    16                     15
     20    732                    734
     30    36039                  36057
     40    1772207                1772261
     50    87109263               87109248
     60    4281550047             4281549331
     70    210444532770           210444530095
     80    10343662267187         10343662265182
     90    508406414757253        508406414781706
    100    24988932929490838      24988932929612479

Figure V.2. The pyramid relative to compositions into prime summands for n =
10..100 (left: exact values; right: asymptotic formula rounded).


Compositions into primes. Here is a surprising application of the general theory. Consider 
the case where S is taken to be the set of prime numbers, Prime = {2,3,5,7,11,...}, thereby 
defining the class of compositions into prime summands. The sequence starts as 


1,0, 1, 1,1, 3,2, 6,6, 10, 16, 20, 35, 46, 72, 105, 


corresponding to $G(z) = z^2 + z^3 + z^5 + \cdots$, and is EIS A023360 in Sloane’s Encyclopedia. The
formula (9) provides the asymptotic shape of the number of such compositions (Figure V.2). It 
is also worth noting that the constants appearing in (9) are easily determined to great accuracy, 
as we now explain. 

By (9) and the preceding equation, the dominant singularity of the OGF of compositions
into primes is the positive root $\sigma < 1$ of the characteristic equation

$$S(z) \equiv \sum_{p\ \mathrm{prime}} z^p = 1.$$

Fix a threshold value $m_0$ (for instance $m_0 = 10$ or 100) and introduce the two series

$$S^-(z) := \sum_{s \in S,\ s < m_0} z^s, \qquad S^+(z) := \left(\sum_{s \in S,\ s < m_0} z^s\right) + \frac{z^{m_0}}{1 - z}.$$

Clearly, for $x \in (0, 1)$, one has $S^-(x) < S(x) < S^+(x)$. Define then two constants $\sigma^-$, $\sigma^+$ by
the conditions

$$S^-(\sigma^-) = 1, \qquad S^+(\sigma^+) = 1, \qquad 0 < \sigma^-, \sigma^+ < 1.$$

These constants are algebraic numbers that are accessible to computation. At the same time,
they satisfy $\sigma^+ < \sigma < \sigma^-$. As the order of truncation, $m_0$, increases, the values of $\sigma^+$, $\sigma^-$
provide better and better approximations to $\sigma$, together with an interval in which $\sigma$ provably
lies. For instance, $m_0 = 10$ is enough to determine that $0.66 < \sigma < 0.69$, and the choice
$m_0 = 100$ gives $\sigma$ to 15 guaranteed digits of accuracy, namely, $\sigma \doteq 0.67740\,17761\,30660$.
Then, the asymptotic formula (9) instantiates as

$$C_n^{\mathrm{Prime}} \sim g(n), \qquad g(n) := \lambda \cdot \beta^n, \qquad \lambda \doteq 0.30365\,52633, \quad \beta \doteq 1.47622\,87836. \tag{10}$$

(The constant $\beta \doteq \sigma^{-1} \doteq 1.47622$ is akin to the family of Backhouse constants described
in [211].)

Figure V.3. Errors in the approximation of the number of compositions into primes
for n = 70..100: left, the values of $C_n^{\mathrm{Prime}} - g(n)$; right, the correction arising
from the next two poles, which are complex conjugate, and its continuous extrapolation
$g_2(n)$, for $n \in [70, 100]$.
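The bracketing of $\sigma$ and the quality of approximation (10) can be reproduced with a few lines of code. This is a sketch under stated assumptions rather than the authors' procedure: the root is located by plain bisection on a hypothetical initial interval $[0.5, 0.9]$, the full series $S(z)$ is replaced by its truncation to primes below 400 (the neglected tail is far below double precision), and all function names are illustrative.

```python
def primes_below(limit):
    """Primes p < limit (limit >= 2) by a simple sieve."""
    sieve = [True] * limit
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit, i):
                sieve[j] = False
    return [p for p in range(limit) if sieve[p]]

def solve_equals_one(f, lo=0.5, hi=0.9, steps=100):
    """Solve f(x) = 1 by bisection, for f increasing with f(lo) < 1 < f(hi)."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sigma_bounds(m0):
    """Return (sigma_plus, sigma_minus), the roots of S^+(x) = 1 and S^-(x) = 1."""
    ps = primes_below(m0)
    s_minus = lambda x: sum(x ** p for p in ps)
    s_plus = lambda x: s_minus(x) + x ** m0 / (1 - x)
    return solve_equals_one(s_plus), solve_equals_one(s_minus)

def prime_compositions(nmax):
    """Exact numbers of compositions of 0..nmax into prime summands (nmax >= 1)."""
    ps = primes_below(nmax + 1)
    c = [1] + [0] * nmax
    for n in range(1, nmax + 1):
        c[n] = sum(c[n - p] for p in ps if p <= n)
    return c

SIGMA_PLUS, SIGMA_MINUS = sigma_bounds(10)
PRIMES = primes_below(400)  # truncation of S(z); an assumption of this sketch
SIGMA = solve_equals_one(lambda x: sum(x ** p for p in PRIMES))
LAMBDA = 1 / (SIGMA * sum(p * SIGMA ** (p - 1) for p in PRIMES))  # 1/(sigma S'(sigma))
BETA = 1 / SIGMA
EXACT = prime_compositions(50)

print(SIGMA_PLUS, SIGMA, SIGMA_MINUS)
print(EXACT[50], LAMBDA * BETA ** 50)
```

With $m_0 = 10$ the two roots already enclose $\sigma$ inside $(0.66, 0.69)$, and the exact counts agree with the first entries of the sequence quoted above.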

Once more, the asymptotic approximation is very good, as is exemplified by the “pyramid”
of Figure V.2. The difference between $C_n^{\mathrm{Prime}}$ and its approximation $g(n)$ from Equation (10) is
plotted on the left-hand part of Figure V.3. The seemingly haphazard oscillations that manifest
themselves are well explained by the principles discussed in Section IV.6.1 (p. 263). It appears
that the next poles of the OGF are complex conjugate and lie near $-0.76 \pm 0.44i$, having
modulus about 0.88. The corresponding residues then jointly contribute a quantity of the form

$$g_2(n) = c \cdot A^n \sin(\omega n + \omega_0), \qquad A \doteq 1.13290,$$

for some constants $c$, $\omega$, $\omega_0$. Comparing the left-hand and right-hand parts of Figure V.3, we
see that this next layer of poles explains quite well the residual error $C_n^{\mathrm{Prime}} - g(n)$.

Here is finally a variant of compositions into primes that demonstrates in a striking way
the scope of the method. Define the set $\mathrm{Prime}_2$ of “twinned primes” as the set of primes that
belong to a twin prime pair, that is, $p \in \mathrm{Prime}_2$ if one of $p - 2$, $p + 2$ is prime. The set $\mathrm{Prime}_2$
starts as 3, 5, 7, 11, 13, 17, 19, 29, 31, ... (prime numbers like 23 or 37 are thus excluded). The
asymptotic formula for the number of compositions of the integer $n$ into summands that are
twinned primes is

$$C_n^{\mathrm{Prime}_2} \sim 0.18937 \cdot 1.29799^n,$$

where the constants are found by methods analogous to the case of all primes. It is quite
remarkable that the constants involved are still computable real numbers (and of low complexity,
even), this despite the fact that it is not known whether the set of twinned primes is finite or
infinite. Incidentally, a sequence that starts like $C_n^{\mathrm{Prime}_2}$,

    1, 0, 0, 1, 0, 1, 1, 1, 2, 1, 3, 4, 3, 7, 7, 8, 14, 15, 21, 28, 33, 47, 58, ...

and coincides till index 22 included (!), but not beyond, was encountered by MacMahon¹, as the
authors discovered, much to their astonishment, from scanning Sloane’s Encyclopedia, where
it appears as EIS A002124. □


▷ V.2. Random generation of supercritical sequences. Let $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$ be a supercritical
sequence scheme. Consider a sequence of i.i.d. (independent, identically distributed) random
variables $Y_1, Y_2, \ldots$, each of them obeying the discrete law

$$\mathbb{P}(Y = k) = g_k\sigma^k, \qquad k \ge 1.$$

A sequence is said to be hitting $n$ if $Y_1 + \cdots + Y_r = n$ for some $r \ge 1$. The vector $(Y_1, \ldots, Y_r)$
for a sequence conditioned to hit $n$ has the same distribution as the sequence of the lengths of
components in a random $\mathcal{F}$-object of size $n$.

For probabilists, this explains the shape of the formulae in Theorem V.1, which resemble
renewal relations [205, Sec. XIII.10]. It also implies that, given a uniform random generator for
$\mathcal{G}$-objects, one can generate a random $\mathcal{F}$-object of size $n$ in $O(n)$ steps on average [177]. This
applies to surjections, alignments, and compositions in particular.
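For integer compositions, where $g_k = 1$ and $\sigma = 1/2$, the hitting principle of Note V.2 turns into a very short rejection sampler. The sketch below is an illustration under that assumption only (it is not the algorithm of [177]); the names `geometric_half` and `random_composition` are hypothetical. Every composition of $n$ has probability $2^{-n}$ of being produced by one pass, so the accepted output is uniform, and a pass hits $n$ with probability exactly $1/2$.

```python
import random

def geometric_half(rng):
    """Sample Y with P(Y = k) = 2**(-k) for k >= 1."""
    k = 1
    while rng.random() < 0.5:
        k += 1
    return k

def random_composition(n, rng):
    """Uniform random composition of n >= 1, by restarting until the partial sums hit n."""
    while True:
        parts, total = [], 0
        while total < n:
            y = geometric_half(rng)
            parts.append(y)
            total += y
        if total == n:      # the sequence hits n: accept
            return parts
        # otherwise the partial sums jumped over n: reject and start afresh

rng = random.Random(2024)   # fixed seed, only to make the run reproducible
print(random_composition(30, rng))
```

Each pass consumes $O(n)$ random draws and a constant expected number of passes is needed, in line with the $O(n)$ average cost stated in the note.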
▷ V.3. Largest components in supercritical sequences. Let $\mathcal{F} = \mathrm{SEQ}(\mathcal{G})$ be a supercritical
sequence. Assume that $g_k = [z^k]G(z)$ satisfies the asymptotic “smoothness” condition

$$g_k \underset{k \to \infty}{\sim} c\,\rho^{-k} k^{\beta}, \qquad c \in \mathbb{R}_{>0}, \quad \beta \in \mathbb{R}.$$

Then the size $L$ of the largest $\mathcal{G}$-component in a random $\mathcal{F}$-object satisfies, for size $n$,

$$\mathbb{E}_{\mathcal{F}_n}(L) = \frac{1}{\log(\rho/\sigma)}\left(\log n + \beta \log\log n\right) + o(\log\log n).$$

This covers integer compositions ($\rho = 1$, $\beta = 0$) and alignments ($\rho = 1$, $\beta = -1$). [The
analysis generalizes the case of longest runs in Example V.4 (p. 308) and is based on similar
principles. The GF of $\mathcal{F}$-objects with $L \le m$ is $F^{\langle m \rangle}(z) = \left(1 - \sum_{k \le m} g_k z^k\right)^{-1}$, according to
Section III.7. For $m$ large enough, this has a dominant singularity which is a simple pole at $\sigma_m$
such that $\sigma_m - \sigma \sim c_1 (\sigma/\rho)^m m^{\beta}$. There follows a double-exponential approximation

$$\mathbb{P}_{\mathcal{F}_n}(L \le m) \sim \exp\left(-c_2\, n\, m^{\beta} (\sigma/\rho)^m\right)$$

in the “central” region. See Example V.4 (p. 308) for a particular instance and Gourdon’s
study [305] for a general theory.] ◁


V.3. Regular specifications and languages 


The purpose of this section is the general study of the (+, x, SEQ) schema, which 
covers all regular specifications. As we show now, “pure” exponential-polynomial
forms (ones with a single dominating exponential) can always be extracted. Theo- 
rems V.3 and V.4 below provide a universal framework for the asymptotic analysis 
of regular classes. Additional structural conditions to be introduced in later sections 
(nested sequences, irreducibility of the dependency graph and of transfer matrices) 
will then be seen to induce further simplifications in asymptotic formulae. 


¹See “Properties of prime numbers deduced from the calculus of symmetric functions”, Proc. London
Math. Soc., 23 (1923), 290–316. MacMahon’s sequence corresponds to compositions into arbitrary odd
primes, and 23 is the first such prime that is not twinned.
V.3.1. Combinatorial aspects. For convenience and without loss of analytic
generality, we consider here unlabelled structures. According to Chapter I (Definition I.10,
p. 51, and the companion Proposition I.2, p. 52), a combinatorial specification
is regular if it is non-recursive (“iterative”) and it involves only the constructions
of Atom, Union, Product, and Sequence. A language $\mathcal{L}$ is S-regular if it is
combinatorially isomorphic to a class $\mathcal{M}$ described by a regular specification. Alternatively,
a language is S-regular if all the operations involved in its description (unions,
catenation products and star operations) are unambiguous. The dictionary translating
constructions into OGFs is

$$\mathcal{F} + \mathcal{G} \mapsto F + G, \qquad \mathcal{F} \times \mathcal{G} \mapsto F \times G, \qquad \mathrm{SEQ}(\mathcal{F}) \mapsto (1 - F)^{-1}, \tag{11}$$

and for languages, under the essential condition of non-ambiguity (Appendix A.7:
Regular languages, p. 733),

$$\mathcal{L} \cup \mathcal{M} \mapsto L + M, \qquad \mathcal{L} \cdot \mathcal{M} \mapsto L \times M, \qquad \mathcal{L}^{\star} \mapsto (1 - L)^{-1}. \tag{12}$$


The rules (11) and (12) then give rise to generating functions that are invariably
rational functions. Consequently, given a regular class $\mathcal{C}$, the exponential-polynomial
form of coefficients expressed by Theorem IV.9 (p. 256) systematically applies, and
one has

$$C_n \equiv [z^n]C(z) = \sum_{j=1}^{m} \Pi_j(n)\,\alpha_j^{-n}, \tag{13}$$

for a family of algebraic numbers $\alpha_j$ (the poles of $C(z)$) and a family of polynomials $\Pi_j$.

As we know from the discussion of periodicities in Section IV.6.1 (p. 263), the
collective behaviour of the sum in (13) depends on whether or not a single $\alpha$ dominates.
In the case where several dominant singularities coexist, fluctuations of sorts
(either periodic or irregular) may manifest themselves. In contrast, if a single $\alpha$ dominates,
then the exponential-polynomial formula acquires a transparent asymptotic
meaning. Accordingly, we set:


Definition V.2. An exponential-polynomial form $\sum_{j=1}^{m} \Pi_j(n)\,\alpha_j^{-n}$ is said to be pure if
$|\alpha_1| < |\alpha_j|$, for all $j \ge 2$. In that case, a single exponential dominates asymptotically
all the other ones.


As we see next for regular languages and specifications, the corresponding counting
coefficients can always be described by a finite collection of pure exponential-polynomial
forms. The fundamental reason is that we are dealing with a special subset
of rational functions, one that enjoys strong positivity properties.

▷ V.4. Positive rational functions. Define the class $\mathrm{Rat}^+$ of positive rational functions as
the smallest class containing polynomials with positive coefficients ($\mathbb{R}_{\ge 0}[z]$) and closed under
sum, product, and quasi-inverse, where $Q(f) = (1 - f)^{-1}$ is applied to elements $f$ such that
$f(0) = 0$. The OGF of any regular class with positive weights attached to neutral structures
and atoms is in $\mathrm{Rat}^+$. Conversely, any function in $\mathrm{Rat}^+$ is the OGF of a positively weighted
regular class. The notion of a $\mathrm{Rat}^+$ function is for instance relevant to the analysis of weighted
word models and Bernoulli trials (Section III.6.1, p. 189). ◁




V.3.2. Analytic aspects. First we need the notion of sections of a sequence. 


Definition V.3. Let $(f_n)$ be a sequence of numbers. Its section of parameters $D$, $r$,
where $D \in \mathbb{Z}_{>0}$ and $r \in \mathbb{Z}_{\ge 0}$, is the subsequence $(f_{nD+r})$. The numbers $D$ and $r$ are
referred to as the modulus and the base, respectively.


The main theorem describing the asymptotic behaviour of regular classes is a 
consequence of Proposition IV.3 (p. 267) and is originally due to Berstel. (See Soit- 
tola’s article [546] as well as the books by Eilenberg [189, Ch VII] and Berstel— 
Reutenauer [56] for context.) 


Theorem V.3 (Asymptotics of regular classes). Let $\mathcal{S}$ be a class described by a regular
specification. Then there exists an integer $D$ such that each section of modulus $D$ of
$S_n$ that is not eventually 0 admits a pure exponential-polynomial form: for $n$ larger
than some $n_0$, and any such section of base $r$, one has

$$S_n = \Pi(n)\,\beta^n + \sum_{j=1}^{m} P_j(n)\,\beta_j^n, \qquad n \equiv r \bmod D,$$

where the quantities $\beta$, $\beta_j$, with $\beta > |\beta_j|$, and the polynomials $\Pi$, $P_j$, with $\Pi(x) \not\equiv 0$,
depend on the base $r$.


Proof. (Sketch.) Let $\alpha_1$ be the dominant pole of $S(z)$ that is positive. Proposition IV.3
(p. 267) asserts that any dominant pole $\alpha$ is such that $\alpha/|\alpha|$ is a root of unity. Let $D_0$
be such that the dominant singularities are all contained in the set $\{\alpha_1\omega^{j-1}\}_{j=1}^{D_0}$, where
$\omega = \exp(2i\pi/D_0)$. By collecting all contributions arising from dominant poles in the
general expansion (13) and by restricting $n$ to a fixed congruence class modulo $D_0$,
namely $n = \nu D_0 + r$ with $0 \le r < D_0$, one gets

$$S_{\nu D_0 + r} = \Pi^{[r]}(n)\,\alpha_1^{-D_0\nu} + O(A^{-n}). \tag{14}$$

There $\Pi^{[r]}$ is a polynomial depending on $r$ and the remainder term represents an
exponential polynomial with growth at most $O(A^{-n})$ for some $A > \alpha_1$.

The sections with modulus $D_0$ that are not eventually 0 can then be categorized
into two classes.

– Let $R_{\ne 0}$ be the set of those values of $r$ such that $\Pi^{[r]}$ is not identically 0.
  The set $R_{\ne 0}$ is non-empty (else the radius of convergence of $S(z)$ would be
  larger than $\alpha_1$). For any base $r \in R_{\ne 0}$, the assertion of the theorem is then
  established with $\beta = 1/\alpha_1$.

– Let $R_0$ be the set of those values of $r$ such that $\Pi^{[r]}(x) \equiv 0$, with $\Pi^{[r]}$ as
  given by (14). Then one needs to examine the next layer of poles of $S(z)$, as
  detailed below.


Consider a number $r$ such that $r \in R_0$, so that the polynomial $\Pi^{[r]}$ is identically 0.
First, we isolate in the expansion of $S(z)$ those indices that are congruent to $r$ modulo
$D_0$. This is achieved by means of a Hadamard product, which, given two power series
$a(z) = \sum a_n z^n$ and $b(z) = \sum b_n z^n$, is defined as the series $c(z) = \sum c_n z^n$ such that
$c_n = a_n b_n$ and is written $c = a \odot b$. In symbols:

$$\left(\sum_{n \ge 0} a_n z^n\right) \odot \left(\sum_{n \ge 0} b_n z^n\right) = \sum_{n \ge 0} a_n b_n z^n. \tag{15}$$

We have:

$$g(z) = S(z) \odot \left(\frac{z^r}{1 - z^{D_0}}\right). \tag{16}$$

A classical theorem [57, 189] from the theory of positive rational functions (in the
sense of Note V.4) asserts that such functions are closed under Hadamard product. (A
dedicated construction for (16) is also possible and is left as an exercise to the reader.)
Then the resulting function $g(z)$ is of the form

$$g(z) = z^r \gamma(z^{D_0}),$$

with the rational function $\gamma(z)$ being analytic at 0. Note that we have $[z^{\nu}]\gamma(z) =
S_{\nu D_0 + r}$, so that $\gamma$ is exactly the generating function of the section of base $r$ of $S(z)$.
One verifies next that $\gamma(z)$, which is obtained by the substitution $z \mapsto z^{1/D_0}$ in
$g(z)z^{-r}$, is itself a positive rational function. Then, by a fresh application of Berstel’s
Theorem (Proposition IV.3, p. 267), this function, if not a polynomial, has a
radius of convergence $\rho$ with all its dominant poles $\sigma$ being such that $\sigma/\rho$ is a root of
unity of order $D_1$, for some $D_1 \ge 1$. The argument originally applied to $S(z)$ can thus
be repeated, with $\gamma(z)$ replacing $S(z)$. In particular, one finds at least one section (of
modulus $D_1$) of the coefficients of $\gamma(z)$ that admits a pure exponential-polynomial
form. The other sections of modulus $D_1$ can themselves be further refined, and so on.

In other words, successive refinements of the sectioning process provide at each
stage at least one pure exponential-polynomial form, possibly leaving a few congruence
classes open for further refinements. Define the layer index of a rational function
$f$ as the integer $\kappa(f)$, such that

$$\kappa(f) = \mathrm{card}\,\bigl\{\, |\zeta| \;\bigm|\; f(\zeta) = \infty \,\bigr\}.$$

(This index is thus the number of different moduli of poles of $f$.) It is seen that each
successive refinement step decreases by at least 1 the layer index of the rational function
involved, thereby ensuring termination of the whole refinement process. Finally,
the collection of the iterated sectionings obtained can be reduced to a single sectioning
according to a common modulus $D$, which is the least common multiple of the
collection of all the finite products $D_0 D_1 \cdots$ that are generated by the algorithm. ■


For instance the coefficients (Figure V.4) of the function

$$L(z) = \frac{1}{(1 - z)(1 - z^2 - z^4)} + \frac{z}{1 - 3z^3}, \tag{17}$$

associated to the regular language $a^{\star}(bb + cccc)^{\star} + d(ddd + eee + fff)^{\star}$, exhibit an
apparently irregular behaviour, with the expansion of $L(z)$ starting as

$$1 + 2z + 2z^2 + 2z^3 + 7z^4 + 4z^5 + 7z^6 + 16z^7 + 12z^8 + 12z^9 + 47z^{10} + 20z^{11} + \cdots.$$

Figure V.4. Plots of $\log L_n$ with $L_n = [z^n]L(z)$ and $L(z)$ as in (17) display fluctuations
that disappear as soon as sections of modulus 6 are considered.


The first term in (17) has a periodicity modulo 2, while the second one has an obvious
periodicity modulo 3. In accordance with the theorem, the sections modulo 6 each
admit a pure exponential-polynomial form and, consequently, they become easy to
describe (Note V.5).


▷ V.5. Sections and asymptotic regimes. For the function $L(z)$ of (17), one finds, with $\varphi :=
(1 + \sqrt{5})/2$ and $c_1, c_2 \in \mathbb{R}_{>0}$,

$$\begin{array}{ll}
L_n = 3^{-1/3} \cdot 3^{n/3} + O(\varphi^{n/2}) & (n \equiv 1, 4 \bmod 6),\\
L_n = c_1 \varphi^{n/2} + O(1) & (n \equiv 0, 2 \bmod 6),\\
L_n = c_2 \varphi^{n/2} + O(1) & (n \equiv 3, 5 \bmod 6),
\end{array}$$

in accordance with the general form predicted by Theorem V.3. ◁
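The sectioning of the coefficients of $L(z)$ from (17) is easy to observe numerically. The following sketch (illustrative names, not from the original text) expands the two rational fractions by their linear recurrences, using the expanded denominator $(1-z)(1-z^2-z^4) = 1 - z - z^2 + z^3 - z^4 + z^5$, and then extracts sections of modulus 6 by slicing, which is what the Hadamard product (16) achieves at the level of coefficients.

```python
def series_inverse(den, nmax):
    """Coefficients up to z**nmax of 1/den(z); den is a coefficient list with den[0] == 1."""
    out = [0] * (nmax + 1)
    out[0] = 1
    for n in range(1, nmax + 1):
        top = min(n, len(den) - 1)
        out[n] = -sum(den[j] * out[n - j] for j in range(1, top + 1))
    return out

def l_coefficients(nmax):
    """Coefficients L_0..L_nmax of L(z) = 1/((1-z)(1-z^2-z^4)) + z/(1-3z^3)."""
    first = series_inverse([1, -1, -1, 1, -1, 1], nmax)
    second = [0] + series_inverse([1, 0, 0, -3], nmax)[:nmax]   # multiplication by z
    return [a + b for a, b in zip(first, second)]

def section(seq, modulus, base):
    """Subsequence (seq[n*modulus + base]) for n = 0, 1, 2, ..."""
    return seq[base::modulus]

coeffs = l_coefficients(120)
print(coeffs[:12])
for base in range(6):
    s = section(coeffs, 6, base)
    print(base, s[-1] / s[-2])   # growth factor over one period of length 6
```

The growth factor over one period approaches $3^{6/3} = 9$ for the bases 1 and 4, and $\varphi^{6/2} = \varphi^3 \approx 4.236$ for the other four bases, as stated in Note V.5.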


▷ V.6. Extension to $\mathrm{Rat}^+$ functions. The conclusions of Theorem V.3 hold for any function
in $\mathrm{Rat}^+$ in the sense of Note V.4. ◁


▷ V.7. Soittola’s Theorem. This is a converse to Theorem V.3 proved in [546]. Assume that
coefficients of an arbitrary rational function $f(z)$ are non-negative and that there exists a
sectioning such that each section admits a pure exponential-polynomial form. Then $f(z)$ is in
$\mathrm{Rat}^+$ in the sense of Note V.4; in particular, $f$ is the OGF of a (weighted) regular class. ◁


Theorem V.3 is useful for interpreting the enumeration of regular classes and
languages. It serves a similar purpose with regard to structural parameters of regular
classes. Indeed, consider a regular specification $\mathcal{C}$ augmented with a mark $u$ that is, as
usual, a neutral object of size 0 (see Chapter III). We let $C(z, u)$ be the corresponding
BGF of $\mathcal{C}$, so that $C_{n,k} = [z^n u^k]C(z, u)$ is the number of $\mathcal{C}$-objects of size $n$ that bear $k$
marks. A suitable placement of marks makes it possible to record the number of times
any given construction enters an object. For instance, in the augmented specification
of binary words,

$$\mathcal{C} = \bigl(\mathrm{SEQ}_{<r}(b) + u\,\mathrm{SEQ}_{\ge r}(b)\bigr)\;\mathrm{SEQ}\bigl(a\,(\mathrm{SEQ}_{<r}(b) + u\,\mathrm{SEQ}_{\ge r}(b))\bigr),$$

all maximal runs of $b$ having length at least $r$ are marked by a $u$. There results the
following BGF for the corresponding parameter “number of runs of $b$’s of length $\ge r$”,

$$C(z, u) = \left(\frac{1 - z^r}{1 - z} + \frac{u z^r}{1 - z}\right) \cdot \frac{1}{1 - z\left(\dfrac{1 - z^r}{1 - z} + \dfrac{u z^r}{1 - z}\right)}, \tag{18}$$

from which mean and variance can be determined. In general, marks make it possible
to analyse the profile, with respect to constructions entering the specification, of a random
object.
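The marked specification and its BGF (18) can be cross-checked on small sizes. The sketch below is illustrative only and assumes $r \ge 1$: writing $B(z,u) = \sum_{j<r} z^j + u\sum_{j \ge r} z^j$ for the GF of a marked run of $b$'s, (18) reads $C = B/(1 - zB)$, that is, $C = B + zBC$, which gives a recurrence on the coefficients $[z^n]C(z,u)$ viewed as polynomials in $u$. These are compared with a brute-force count over all binary words. Function names are hypothetical.

```python
from itertools import product

def poly_add(p, q):
    """Sum of two polynomials in u given as coefficient lists."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_mul(p, q):
    """Product of two polynomials in u given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def bgf_coefficients(nmax, r):
    """[z^n] C(z,u), n = 0..nmax, as polynomials in u, from the relation C = B + z*B*C."""
    b = [[1] if j < r else [0, 1] for j in range(nmax + 1)]   # [z^j] B(z,u)
    c = []
    for n in range(nmax + 1):
        acc = b[n]
        for j in range(n):
            acc = poly_add(acc, poly_mul(b[j], c[n - 1 - j]))
        c.append(acc)
    return c

def brute_force(n, r):
    """Words over {a, b} of length n, counted by number of maximal b-runs of length >= r."""
    counts = {}
    for word in product("ab", repeat=n):
        runs = "".join(word).split("a")
        k = sum(1 for run in runs if len(run) >= r)
        counts[k] = counts.get(k, 0) + 1
    return counts

for n, poly in enumerate(bgf_coefficients(8, 2)):
    print(n, poly, brute_force(n, 2))
```

Setting $u = 1$ (summing each coefficient list) returns $2^n$, in agreement with $C(z, 1) = 1/(1 - 2z)$.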

Theorem V.4 (Profile of regular classes). Consider a regular specification of a class $\mathcal{C}$,
augmented with a mark and let $\chi$ be the parameter corresponding to the number of
occurrences of that mark. There exists a sectioning index $d$ such that for any fixed
section of $(C_n)$ of modulus $d$, the following holds: the moment of integral order $s \ge 1$
of $\chi$ satisfies an asymptotic formula

$$\mathbb{E}_{\mathcal{C}_n}[\chi^s] = Q(n)\,\beta^n + O(G^n), \tag{19}$$

where the quantities $\beta$, $Q$, $G$ depend on the particular section considered, with $0 <
\beta \le 1$, $Q(n)$ a rational fraction, and $G < \beta$.


(Only sections that are not eventually 0 are to be considered.)


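Proof. The case of expectations suffices to indicate the lines of a general proof. One
possible approach² is to build a derived specification $\mathcal{E}$ such that

$$E(z) \equiv \left.\frac{\partial}{\partial u} C(z, u)\right|_{u=1},$$

which is also a regular specification. To this purpose, define a transformation $\partial$ on
specifications inductively by the rules

$$\partial(\mathcal{A} + \mathcal{B}) = \partial\mathcal{A} + \partial\mathcal{B}, \qquad \partial(\mathcal{A} \times \mathcal{B}) = \partial\mathcal{A} \times \mathcal{B} + \mathcal{A} \times \partial\mathcal{B},$$
$$\partial\,\mathrm{SEQ}(\mathcal{A}) = \mathrm{SEQ}(\mathcal{A}) \times \partial\mathcal{A} \times \mathrm{SEQ}(\mathcal{A}),$$

together with the initial conditions $\partial u = 1$ and $\partial\mathcal{Z} = \emptyset$. This is a form of combinatorial
differentiation: an object $\gamma \in \mathcal{C}$ corresponds to $\chi(\gamma)$ objects in $\mathcal{E}$, namely, one
for each choice of an occurrence of the mark.

As a consequence, $E_n$ is the cumulated value of $\chi$ over $\mathcal{C}_n$, so that $E_n/C_n =
\mathbb{E}_{\mathcal{C}_n}[\chi]$. On the other hand, $\mathcal{E}$ is a regular specification to which Theorem V.3 applies.
The result follows upon considering (if necessary) a sectioning that refines the
sectionings of both $\mathcal{C}$ and $\mathcal{E}$. The argument extends easily to higher moments. ■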


▷ V.8. A rational mean. Consider the regular language $\mathcal{C} = a^{\star}(b + c)^{\star}d(b + c)^{\star}$. Let $\chi$ be the
length of the initial run of $a$’s. Then one finds

$$C(z) = \frac{z}{(1 - z)(1 - 2z)^2}, \qquad E(z) = \frac{z^2}{(1 - z)^2(1 - 2z)^2}.$$

Thus the mean of $\chi$ satisfies

$$\mathbb{E}_{\mathcal{C}_n}[\chi] \equiv \frac{E_n}{C_n} = \frac{(n - 3)2^n + (n + 3)}{(n - 1)2^n + 1} = 1 - \frac{2}{n} + O\left(\frac{1}{n^2}\right).$$


²Equivalently, one may operate at generating function level and observe that the derivative of a $\mathrm{Rat}^+$
function is $\mathrm{Rat}^+$; cf. Notes V.4 and V.6.
Class                            Asymptotics
Integer compositions             $2^{n-1}$
  – k summands                   $\sim \dfrac{n^{k-1}}{(k-1)!}$            (§I.3.1, p. 44)
  – summands ≤ r                 $\sim c_r \beta_r^n$                       (§I.3.1, p. 42)
Integer partitions
  – k summands                   $\sim \dfrac{n^{k-1}}{k!\,(k-1)!}$        (§I.3.1, p. 44)
  – summands ≤ r                 $\sim \dfrac{n^{r-1}}{r!\,(r-1)!}$        (§I.3.1, p. 43)
Set partitions, k classes        $\sim \dfrac{k^n}{k!}$                    (§I.4.3, p. 62)
Words excluding a pattern p      $\sim c_p \beta_p^n$                       (§IV.6.3, p. 271)


Figure V.5. A pot-pourri of regular classes and their asymptotics.


Generally, in the statement of Theorem V.4, let $Q(n) = A(n)/B(n)$ with $A$, $B$ polynomials and
$a = \deg(A)$, $b = \deg(B)$. The following combinations prove to be possible (for first moments):
$\beta = 1$ and $(a, b)$ any pair such that $0 \le a \le b + 1$; also, $\beta < 1$ and $(a, b)$ any pair of elements
$\ge 0$. ◁
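The closed forms of Note V.8 can be confirmed directly. The sketch below is an illustration with hypothetical names: it expands $C(z) = z/((1-z)(1-2z)^2)$ and $E(z) = z^2/((1-z)^2(1-2z)^2)$ by repeated multiplication with geometric series, and, independently, enumerates the language $a^{\star}(b+c)^{\star}d(b+c)^{\star}$ by brute force using a regular expression, recording the length of the initial run of $a$'s.

```python
import re
from fractions import Fraction
from itertools import product

def times_geometric(series, c):
    """Multiply a truncated power series (coefficient list) by 1/(1 - c*z)."""
    out = []
    for n, coeff in enumerate(series):
        out.append(coeff + (c * out[n - 1] if n else 0))
    return out

def gf_counts(nmax):
    """Coefficient lists (C_n) and (E_n), n = 0..nmax (nmax >= 2), from the two GFs."""
    z1 = [0, 1] + [0] * (nmax - 1)       # the series z
    z2 = [0, 0, 1] + [0] * (nmax - 2)    # the series z^2
    C = z1
    for c in (1, 2, 2):                  # divide by (1-z)(1-2z)^2
        C = times_geometric(C, c)
    E = z2
    for c in (1, 1, 2, 2):               # divide by (1-z)^2 (1-2z)^2
        E = times_geometric(E, c)
    return C, E

PATTERN = re.compile(r"(a*)[bc]*d[bc]*")

def brute_force(n):
    """(number of words of length n in the language, cumulated length of the initial a-run)."""
    total = cumulated = 0
    for letters in product("abcd", repeat=n):
        m = PATTERN.fullmatch("".join(letters))
        if m:
            total += 1
            cumulated += len(m.group(1))
    return total, cumulated

C, E = gf_counts(30)
for n in (5, 10, 20, 30):
    print(n, Fraction(E[n], C[n]), float(Fraction(E[n], C[n])), 1 - 2 / n)
```

The exact mean $E_n/C_n$ is a rational number close to $(n-3)/(n-1)$, which is where the expansion $1 - 2/n + O(1/n^2)$ comes from.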


▷ V.9. Shuffle products. Let $\mathcal{L}$, $\mathcal{M}$ be two languages over two disjoint alphabets. Then, the
shuffle product $\mathcal{S}$ of $\mathcal{L}$ and $\mathcal{M}$ is such that $\widehat{S}(z) = \widehat{L}(z) \cdot \widehat{M}(z)$, where $\widehat{S}$, $\widehat{L}$, $\widehat{M}$ are the
exponential generating functions of $\mathcal{S}$, $\mathcal{L}$, $\mathcal{M}$. Accordingly, if the OGFs $L(z)$ and $M(z)$ are rational
then the OGF $S(z)$ is also rational. (This technique may be used to analyse generalized birthday
paradox and coupon collector problems; see [231].) ◁


V.3.3. Applications. This subsection details several examples that illustrate the
explicit determination of exponential-polynomial forms in regular specifications, in
accordance with Theorems V.3 and V.4. We start by recapitulating a collection, a
“pot-pourri”, of combinatorial problems already encountered in Part A, where rational
generating functions have been used en passant. We then examine longest runs in
words, walks of the pure-birth type, and subsequence (hidden pattern) statistics.


Example V.3. A pot-pourri of regular specifications. A few combinatorial problems, to be 
found scattered across Chapters I-IV, are reducible to regular specifications: see Figure V.5 for 
a summary. 


Compositions of integers (Section I.3, p. 39) are specified by $\mathcal{C} = \mathrm{SEQ}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$,
whence the OGF $(1 - z)/(1 - 2z)$ and the closed form $C_n = 2^{n-1}$, an especially transparent
exponential-polynomial form. Polar singularities are also present for compositions into $k$
summands that are described by $\mathrm{SEQ}_k(\mathrm{SEQ}_{\ge 1}(\mathcal{Z}))$ and for compositions whose summands are
restricted to the interval $[1\,..\,r]$ (i.e., $\mathrm{SEQ}(\mathrm{SEQ}_{1..r}(\mathcal{Z}))$), with corresponding generating
functions

$$\frac{z^k}{(1 - z)^k}, \qquad \frac{1 - z}{1 - 2z + z^{r+1}}.$$

In the first case, there is an explicit form for the coefficients, $\binom{n-1}{k-1}$, which constitutes a
particular exponential-polynomial form (with the basis of the exponential being 1). The second case
requires a dedicated analysis of the dominant polar singularity, which is recognizably a variant
of Example V.4 (p. 308 below) dedicated to longest runs in random binary words.
Integer partitions involve the multiset construction. However, when summands are
restricted to the interval $[1\,..\,r]$, the specification and the OGF are given by

$$\mathrm{MSET}(\mathrm{SEQ}_{1..r}(\mathcal{Z})) \cong \mathrm{SEQ}(\mathcal{Z}) \times \mathrm{SEQ}(\mathcal{Z}^2) \times \cdots \times \mathrm{SEQ}(\mathcal{Z}^r) \quad\Longrightarrow\quad \prod_{j=1}^{r} \frac{1}{1 - z^j}.$$

This case, introduced in Section I.3 (p. 39), also served as a leading example in our discussion
of denumerants in Example IV.6 (p. 257): the analysis of the pole at 1 furnishes the dominant
asymptotic behaviour, $n^{r-1}/(r!\,(r - 1)!)$, for such special partitions. The enumeration of
partitions by number of parts then follows, by duality, from the staircase representation.

Set partitions are typically labelled objects. However, when suitably constrained, they can
be encoded by regular expressions; see Section I.4.3 (p. 62) for partitions into $k$ classes, where
the OGF found is

$$S^{(k)}(z) = \frac{z^k}{(1 - z)(1 - 2z) \cdots (1 - kz)}, \qquad\text{implying}\qquad S_n^{(k)} \sim \frac{k^n}{k!},$$

and the asymptotic estimate results from the partial fraction decomposition and the dominant
pole at $1/k$.

Words lead to many problems that are prototypical of the regular specification framework. 
In Section I. 4 (p. 49), we saw that one could give a regular expression describing the set of 
words containing the pattern abb, from which the exact and asymptotic forms of counting 
coefficients derive. For a general pattern p, the generating functions of words constrained to 
include (or dually exclude) p are rational. The corresponding asymptotic analysis has been 
given in Section IV. 6.3 (p. 271). 

Words can also be analysed under the Bernoulli model, where letter $i$ is selected with
probability $p_i$; cf. Section III.6.1, p. 189, for a general discussion including the analysis of
records in random words (p. 190). □


▷ V.10. Partially commutative monoids. Let $\mathcal{W} = \mathcal{A}^{\star}$ be the set of all words over a finite
alphabet $\mathcal{A}$. Consider a collection $C$ of commutation rules between pairs of elements of $\mathcal{A}$. For
instance, if $\mathcal{A} = \{a, b, c\}$, then $C = \{ab = ba,\ ac = ca\}$ means that $a$ commutes with both $b$
and $c$, but $bc$ is not a commuting pair: $bc \ne cb$. Let $\mathcal{M} = \mathcal{W}/[C]$ be the set of equivalence
classes of words (monomials) under the rules induced by $C$. The set $\mathcal{M}$ is said to be a partially
commutative monoid or a trace monoid [105].

If $\mathcal{A} = \{a, b\}$, then the two possibilities for $C$ are $C = \emptyset$ and $C := \{ab = ba\}$. Normal
forms for $\mathcal{M}$ are given by the regular expressions $(a + b)^{\star}$ and $a^{\star}b^{\star}$ corresponding to the OGFs

$$\frac{1}{1 - a - b}, \qquad \frac{1}{1 - a - b + ab}.$$

If $\mathcal{A} = \{a, b, c\}$, the possibilities for $C$, the corresponding normal forms, and the OGFs $M$ are
as follows. If $C = \emptyset$, then $\mathcal{M} \cong (a + b + c)^{\star}$ with OGF $(1 - a - b - c)^{-1}$; the other cases are

    Rules $C$                            Normal form                          OGF
    $ab = ba$                            $(a^{\star}b^{\star}c)^{\star}a^{\star}b^{\star}$    $\dfrac{1}{1 - a - b - c + ab}$
    $ab = ba,\ ac = ca$                  $a^{\star}(b + c)^{\star}$                           $\dfrac{1}{1 - a - b - c + ab + ac}$
    $ab = ba,\ ac = ca,\ bc = cb$        $a^{\star}b^{\star}c^{\star}$                        $\dfrac{1}{1 - a - b - c + ab + ac + bc - abc}$

Cartier and Foata [105] have discovered the general form (based on extended Möbius inversion),

$$M = \left(\sum_{F} (-1)^{|F|} F\right)^{-1},$$

where the sum is over all monomials $F$ composed of distinct letters that all commute pairwise.
Viennot [597] has discovered an attractive geometric presentation of partially commutative
monoids in terms of heaps of pieces, which has startling applications to several areas of
combinatorial theory. (Example I.18, p. 80, relative to animals provides an example.) Goldwurm and
Santini [298] have shown that $[z^n]M(z) \sim K \cdot \alpha^n$ for $K, \alpha > 0$. ◁


Longest runs. It is possible to develop a complete analysis of runs of consecutive 
equal letters in random sequences: this is in theory a special case of the analysis 
of patterns in random texts (Section IV. 6.3, p. 271), but the particular nature of the 
patterns makes it possible to derive much more explicit results, including asymptotic 
distributions. 


Example V.4. Longest runs in words. Longest runs in words, introduced in Section I.4.1
(p. 51), provide an illustration of the technique of localizing dominant singularities in rational 
functions and of the corresponding coefficient extraction process. The probabilistic problem is 
a famous one, discussed by Feller in [205]: it represents a basic question in the analysis of runs 
of good (or bad) luck in a succession of independent events. Our presentation closely follows 
an insightful note of Knuth [375] whose motivation was the analysis of carry propagation in 
certain binary adders. 


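Start from the class $\mathcal{W}$ of all binary words over the alphabet $\{a, b\}$. Our interest lies in
the length $L$ of the longest consecutive block of $a$’s in a word. For the property $L < k$, the
specification and the corresponding OGF are

$$\mathcal{W}^{\langle k \rangle} = \mathrm{SEQ}_{<k}(a)\,\mathrm{SEQ}(b\,\mathrm{SEQ}_{<k}(a)) \quad\Longrightarrow\quad W^{\langle k \rangle}(z) = \frac{1 - z^k}{1 - z} \cdot \frac{1}{1 - z\,\dfrac{1 - z^k}{1 - z}},$$

that is,

$$W^{\langle k \rangle}(z) = \frac{1 - z^k}{1 - 2z + z^{k+1}}. \tag{20}$$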

This represents a collection of OGFs indexed by k, which contain all the information relative to 
the distribution of longest runs in random words. We propose to prove: 


Proposition V.1. The longest run parameter L taken over the set of binary words of length n 
(endowed with the uniform distribution) satisfies the uniform estimate? 


—) a(n) = 2U8"), 


dh 


(21) Pa(L < lgn| +h) =e 4@™2"" 4.0 ( 


In particular, the mean satisfies 


3 log? 
il) tant Ly 5 + Pen) + Of : *), 


log Jn 


where P is a continuous periodic function whose Fourier expansion is given by (29). The 
variance satisfies Vj (L) = O(1) and the distribution is concentrated around its mean. 


The probability distributions appearing in (21) are known as double exponential distributions 
(Figure V.6, p. 311). The formula (21) does not represent a single limit distribution in the usual 
sense of Chapter IX, but rather a whole family of distributions indexed by the fractional part of 
$\lg n$, thus dictated by the way n places itself with respect to powers of 2.
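The OGFs (20) can be expanded mechanically, which gives a quick way to test the double exponential estimate (21) on actual counts. The sketch below is only an illustration: the function names are hypothetical, the recurrence is read off from the denominator $1 - 2z + z^{k+1}$ of (20), and a brute-force enumeration is included purely as a cross-check for small n.

```python
from itertools import product
from math import exp, floor, log2

def count_runs_below(k, n_max):
    """Coefficients [z^n] W^<k>(z), n = 0..n_max, from the rational form (20).

    (1 - 2z + z^(k+1)) W(z) = 1 - z^k translates into
    w_n = 2 w_{n-1} - w_{n-k-1} + [n == 0] - [n == k].
    """
    w = []
    for n in range(n_max + 1):
        value = (1 if n == 0 else 0) - (1 if n == k else 0)
        if n >= 1:
            value += 2 * w[n - 1]
        if n >= k + 1:
            value -= w[n - k - 1]
        w.append(value)
    return w

def brute_force(k, n):
    """Count binary words of length n whose longest run of a's is < k."""
    total = 0
    for letters in product("ab", repeat=n):
        longest = max(len(run) for run in "".join(letters).split("b"))
        total += longest < k
    return total

def probability_below(k, n):
    """P_n(L < k) under the uniform distribution on words of length n."""
    return count_runs_below(k, n)[n] / 2**n

def double_exponential(n, h):
    """Main term of (21): exp(-alpha(n) 2^(-h-1)), alpha(n) = 2^frac(lg n)."""
    alpha = 2 ** (log2(n) - floor(log2(n)))
    return exp(-alpha * 2 ** (-h - 1))

if __name__ == "__main__":
    n, h = 2000, 1
    k = floor(log2(n)) + h
    print(probability_below(k, n), double_exponential(n, h))
```

For k = 2 the counts are the Fibonacci numbers, as expected for words avoiding aa; for n in the thousands the exact probability and the main term of (21) already agree to a few decimal places, in line with the stated error term.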


³ The symbol lg x denotes the binary logarithm, $\lg x \equiv \log_2 x$, and {x} is the fractional part function
($\{\pi\} = 0.14159\cdots$).

Proof. The proof consists of the following steps: locate the dominant pole; estimate the
corresponding contribution; separate the dominant pole from the other poles in order to derive
constructive error terms; finally approximate the main quantities of interest.


(i) Location of the dominant pole. The OGF $W^{\langle k\rangle}$ has, by the first form of (20), a dominant
pole $\rho_k$, which is a root of the equation $1 = s(\rho_k)$, where $s(z) = z(1-z^k)/(1-z)$. We consider
k ≥ 2. Since s(z) is an increasing polynomial and s(0) = 0, s(1/2) < 1, s(1) = k, the root $\rho_k$
must lie in the open interval (1/2, 1). In fact, as one easily verifies, the condition k ≥ 3
guarantees that s(0.6) > 1 (for k = 2, one has $s(0.6) = 0.96$ and $\rho_2 = (\sqrt{5}-1)/2 \approx 0.618$),
hence the first estimate

(22)    $\dfrac{1}{2} < \rho_k < \dfrac{3}{5} \qquad (k \ge 3).$


It now becomes possible to derive precise estimates by bootstrapping. (This technique is a
form of iteration for approaching a fixed point; its use in the context of asymptotic expansions
is detailed in De Bruijn's book [143].) Writing the defining equation for $\rho_k$ as a fixed point
equation,

$z = \dfrac{1}{2}\left(1 + z^{k+1}\right),$

and making use of the rough estimates (22) yields next

(23)    $\dfrac{1}{2}\left(1 + \left(\dfrac{1}{2}\right)^{k+1}\right) < \rho_k < \dfrac{1}{2}\left(1 + \left(\dfrac{3}{5}\right)^{k+1}\right).$

Thus, $\rho_k$ is exponentially close to 1/2, and further iteration from (23) shows

(24)    $\rho_k = \dfrac{1}{2} + \dfrac{1}{2^{k+2}} + O\!\left(\dfrac{k}{2^{2k}}\right).$


(ii) Contribution from the dominant pole. A straightforward calculation provides the value
of the residue,

(25)    $R_{n,k} := -\operatorname{Res}\left[W^{\langle k\rangle}(z)\,z^{-n-1};\ z = \rho_k\right] = \dfrac{1-\rho_k^{k}}{2-(k+1)\rho_k^{k}}\;\rho_k^{-n-1},$

which is expected to provide the main approximation to the coefficients of $W^{\langle k\rangle}$
as n → ∞. The quantity in (25) is of the rough form $2^n e^{-n/2^{k+1}}$; we shall return to such
approximations shortly.


(iii) Separation of the subdominant poles. Consider the circle |z| = 3/4 and take the
second form of the denominator of $W^{\langle k\rangle}$, namely, that of (20):

$1 - 2z + z^{k+1}.$

In view of Rouché's theorem (p. 270), we may regard this polynomial as the sum f(z) + g(z),
where f(z) = 1 − 2z and $g(z) = z^{k+1}$. The term f(z) has on the circle |z| = 3/4 a modulus
that varies between 1/2 and 5/2; the term g(z) is at most 27/64 for any k ≥ 2. Thus, on the
circle |z| = 3/4, one has |g(z)| < |f(z)|, so that f(z) and f(z) + g(z) have the same number
of zeros inside the circle. Since f(z) admits z = 1/2 as only zero there, the denominator must
also have a unique root in |z| ≤ 3/4, and that root must coincide with $\rho_k$.

Similar arguments also give bounds on the error term when the number of words w satisfying
L(w) < k is estimated by the residue (25) at the dominant pole. On the circle |z| = 3/4, the
denominator of $W^{\langle k\rangle}$ stays bounded away from 0 (its modulus is at least 5/64 when k ≥ 2, by
previous considerations). Thus, the modulus of the remainder integral is $O((4/3)^n)$, and in fact
bounded from above by $35(4/3)^n$. In summary, letting $q_{n,k}$ represent the probability that the
longest run in a random word of length n is less than k, one obtains the main estimate (k ≥ 2)

(26)    $q_{n,k} := \mathbb{P}_n(L < k) = \dfrac{1-\rho_k^{k}}{1-(k+1)\rho_k^{k}/2}\left(\dfrac{1}{2\rho_k}\right)^{n+1} + O\!\left(\left(\dfrac{2}{3}\right)^{n}\right),$

which holds uniformly with respect to k. Here is a table of the numerical values of the quantities
appearing in the approximation of $q_{n,k}$, written under the form $c_k\cdot(2\rho_k)^{-n}$:


    k     c_k · (2ρ_k)^{-n}

    2     1.17082 · 0.80901^n
    3     1.13745 · 0.91964^n
    4     1.09166 · 0.96378^n
    5     1.05753 · 0.98297^n
   10     1.00394 · 0.99950^n
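The entries of this table can be recomputed by running the bootstrapping iteration of step (i) to numerical convergence and inserting the result into the coefficient of (26). The following is a minimal sketch under that reading; the function names and the fixed iteration count are illustrative choices, not part of the original analysis.

```python
def dominant_pole(k, iterations=200):
    """Approximate rho_k by iterating z -> (1 + z^(k+1)) / 2 from z = 1/2.

    The iterates increase monotonically towards the smallest fixed point
    above 1/2, which is the dominant pole rho_k for k >= 2.
    """
    z = 0.5
    for _ in range(iterations):
        z = 0.5 * (1.0 + z ** (k + 1))
    return z

def main_term(k):
    """Return (c_k, 1/(2 rho_k)), so that q_{n,k} ~ c_k * (1/(2 rho_k))**n."""
    rho = dominant_pole(k)
    c = (1 - rho**k) / (1 - (k + 1) * rho**k / 2) / (2 * rho)
    return c, 1 / (2 * rho)

if __name__ == "__main__":
    for k in (2, 3, 4, 5, 10):
        c, base = main_term(k)
        print(f"{k:2d}  {c:.5f} * {base:.5f}^n")
```

The same routine also illustrates (24): for k = 10 the computed pole differs from $1/2 + 2^{-12}$ by an amount of the order of $k/2^{2k}$.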


(iv) Final approximations. There only remains to transform the main estimate (26) into
the limit form asserted in the statement. First, the "tail inequalities" (with $\lg x \equiv \log_2 x$)

(27)    $\mathbb{P}_n\!\left(L < \tfrac{3}{4}\lg n\right) = O\!\left(e^{-\frac{1}{2}\sqrt[4]{n}}\right), \qquad \mathbb{P}_n\!\left(L \ge 2\lg n + y\right) = O\!\left(\dfrac{2^{-y}}{n}\right)$

describe the tail of the probability distribution of L. They are derived from simple bounding
techniques applied to the main approximation (26) using (24). Thus, for asymptotic purposes,
only a relatively small region around $\lg n$ needs to be considered.

Regarding the central regime, for $k = \lg n + x$ and x in $[-\frac{1}{4}\lg n, \lg n]$, the approximation
(24) of $\rho_k$ and related quantities applies, and one finds

$(2\rho_k)^{-n} = \exp\!\left(-\dfrac{n}{2^{k+1}} + O\!\left(kn2^{-2k}\right)\right) = e^{-n/2^{k+1}}\left(1 + O\!\left(\dfrac{\log n}{\sqrt{n}}\right)\right).$

(This results from standard expansions of the form $(1-a)^n = e^{-na}\exp(O(na^2))$.) At the
same time, the coefficient in (26) of the quantity $(2\rho_k)^{-n}$ is

$1 + O\!\left(k\rho_k^{k}\right) = 1 + O\!\left(\dfrac{\log n}{n^{3/4}}\right).$

Thus a double exponential approximation holds (Figure V.6): for $k = \lg n + x$ with x in
$[-\frac{1}{4}\lg n, \lg n]$, one has (uniformly)

(28)    $q_{n,k} = e^{-n/2^{k+1}}\left(1 + O\!\left(\dfrac{\log n}{\sqrt{n}}\right)\right).$

In particular, upon setting $k = \lfloor \lg n\rfloor + h$ and making use of the tail inequalities (27), the first
part of the statement, namely Equation (21), follows. (The floor function takes into account the
fact that k must be an integer.)

The mean and variance estimates are derived from the fact that the distribution quickly
decays at values away from $\lg n$ (by (27)) while it satisfies Equation (28) in the central region.
The mean satisfies

$\mathbb{E}_n(L) = \sum_{h\ge 1}\bigl[1 - \mathbb{P}_n(L < h)\bigr] = \Phi\!\left(\dfrac{n}{2}\right) - 1 + O\!\left(\dfrac{\log^2 n}{\sqrt{n}}\right), \qquad \Phi(x) := \sum_{h\ge 0}\left[1 - e^{-x/2^{h}}\right].$

Consider the three cases $h < h_0$, $h \in [h_0, h_1]$, and $h > h_1$ with $h_0 = \lg x - \log\log x$ and
$h_1 = \lg x + \log\log x$, where the general term is (respectively) close to 1, between 0 and 1, and
close to 0. By summing, one finds elementarily $\Phi(x) = \lg x + O(\log\log x)$ as x → ∞. (An
elementary way of catching the next O(1) term is discussed for instance in [538, p. 403].)

Figure V.6. The double exponential laws: Left, histograms for n at $2^p$ (black),
$2^{p+1/3}$ (dark gray), and $2^{p+2/3}$ (light gray), where $x = k - \lg n$. Right, empirical
histograms for 1000 simulations with n = 100 (top) and n = 140 (bottom).

The method of choice for precise asymptotics is to treat Φ(x) as a harmonic sum and apply
Mellin transform techniques (Appendix B.7: Mellin transforms, p. 762). The Mellin transform
of Φ(x) is

$\Phi^\star(s) := \displaystyle\int_0^\infty \Phi(x)\,x^{s-1}\,dx = -\dfrac{\Gamma(s)}{1-2^{s}}, \qquad -1 < \Re(s) < 0.$

The double pole of $\Phi^\star$ at 0 and the simple poles at $s = 2ik\pi/\log 2$ are reflected by an asymptotic
expansion that involves a Fourier series:

(29)    $\Phi(x) = \lg x + \dfrac{\gamma}{\log 2} + \dfrac{1}{2} + P(\lg x) + O(x^{-1}), \qquad P(w) := -\dfrac{1}{\log 2}\sum_{k\in\mathbb{Z}\setminus\{0\}} \Gamma\!\left(\dfrac{2ik\pi}{\log 2}\right) e^{-2ik\pi w}.$

The oscillating function P(w) is found to have tiny fluctuations, of the order of $10^{-6}$; for
instance, the first Fourier coefficient has amplitude: $|\Gamma(2i\pi/\log 2)|/\log 2 \doteq 7.86\cdot 10^{-7}$. (See
also [234, 311, 375, 564] for more on this topic.) The variance is similarly analysed. This
concludes the proof of Proposition V.1. ∎
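The expansion (29) is easy to probe numerically, since the harmonic sum Φ(x) converges geometrically. The sketch below compares Φ(x) with the non-oscillating part of (29) and evaluates the amplitude of the first Fourier coefficient. It relies on two assumptions that are not in the text: the numerical value of Euler's constant γ is hard-coded, and the modulus of Γ on the imaginary axis is obtained from the classical identity $|\Gamma(iy)|^2 = \pi/(y\sinh \pi y)$ rather than from a complex Gamma routine.

```python
from math import expm1, log, log2, pi, sinh, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded

def phi(x, terms=200):
    """Harmonic sum Phi(x) = sum_{h>=0} (1 - exp(-x / 2^h)), truncated."""
    # -expm1(-t) computes 1 - exp(-t) accurately when t is tiny.
    return sum(-expm1(-x / 2.0**h) for h in range(terms))

def phi_smooth(x):
    """Non-oscillating part of (29): lg x + gamma/log 2 + 1/2."""
    return log2(x) + EULER_GAMMA / log(2) + 0.5

def first_fourier_amplitude():
    """|Gamma(2 i pi / log 2)| / log 2, via |Gamma(iy)|^2 = pi / (y sinh(pi y))."""
    y = 2 * pi / log(2)
    return sqrt(pi / (y * sinh(pi * y))) / log(2)

if __name__ == "__main__":
    for x in (10.0, 1000.0, 1e6):
        print(x, phi(x) - phi_smooth(x))
    print(first_fourier_amplitude())
```

The differences printed in the loop stay of the order of $10^{-6}$, which is the size of the periodic term P(lg x) announced above.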


The double exponential approximation in (21) is typical of extremal statistics. What is
striking here is the existence of a family of distributions indexed by the fractional part of $\lg n$.
This fact is then reflected by the presence of oscillating functions in moments of the random
variable L. □


▷ V.11. Longest runs in Bernoulli sequences. Consider an alphabet $\mathcal{A} = \{a_j\}$ with letter $a_j$
independently chosen with probability $p_j$. The OGF of words where each run of equal letters
has length at most k is derived from the construction of Smirnov words (pp. 204 and 262), and
it is found to be

$W^{[k]}(z) = \left(1 - \sum_i \dfrac{p_i z\,\bigl(1-(p_i z)^{k}\bigr)}{1-(p_i z)^{k+1}}\right)^{-1}.$

Let $p_{\max}$ be the largest of the $p_j$. Then the expected length of the longest run of any letter is
$\log n/\log p_{\max}^{-1} + O(1)$, and precise quantitative information can be derived from the OGFs by
methods akin to Example IV.10 (Smirnov words and Carlitz compositions, p. 262). ◁

Walks of the pure-birth type. The next two examples develop the analysis of 
walks in a special type of graphs. These examples serve two purposes: they illus- 
trate further cases of modelling by means of regular specifications, and they provide 
a bridge to the analysis of lattice paths in the next section. Furthermore, some spe- 
cific walks of the pure-birth type turn out to have applications to the analysis of a 
probabilistic algorithm (Approximate Counting). 


Example V.5. Walks of the pure-birth type. Consider a walk on the non-negative integers that 
starts at 0 and is only allowed either to stay at the same place or move by an increment of +1. 
Our goal is to enumerate the walks that start from 0 and reach point m in n steps. A step from j
to j + 1 will be encoded by a letter $a_j$; a step from j to j will be encoded by $c_j$, in accordance
with the following state diagram:

(30)    [state diagram: states 0, 1, 2, ...; a loop labelled $c_j$ at each state j and an arrow labelled $a_j$ from state j to state j + 1]

The language encoding all legal walks from state 0 to state m can be described by a regular
expression:

$\mathcal{H}_{0,m} = \mathrm{SEQ}(c_0)\,a_0\,\mathrm{SEQ}(c_1)\,a_1 \cdots \mathrm{SEQ}(c_{m-1})\,a_{m-1}\,\mathrm{SEQ}(c_m).$

Symbolically using letters as variables, the corresponding ordinary multivariate generating
function is then (with $\mathbf{a} = (a_0, \ldots)$ and $\mathbf{c} = (c_0, \ldots)$)

$H_{0,m}(\mathbf{a}, \mathbf{c}) = \dfrac{a_0 a_1 \cdots a_{m-1}}{(1-c_0)(1-c_1)\cdots(1-c_m)}.$

Assume now that the steps are assigned weights, with $\alpha_j$ corresponding to $a_j$ and $\gamma_j$ to $c_j$.
Weights of letters are extended multiplicatively to words in the usual way (cf Section III.6.1,
p. 189). In addition, upon taking $\gamma_j = 1 - \alpha_j$, one obtains a probabilistic weighting: the walker
starts from position 0, and, if at j, at each clock tick, she either stays at the same place with
probability $1 - \alpha_j$ or moves to the right with probability $\alpha_j$. The OGF of such weighted walks
then becomes

(31)    $H_{0,m}(z) = \dfrac{\alpha_0\alpha_1\cdots\alpha_{m-1}\,z^{m}}{(1-(1-\alpha_0)z)(1-(1-\alpha_1)z)\cdots(1-(1-\alpha_m)z)},$

and $[z^n]H_{0,m}$ is the probability for the walker to be found at position m at (discrete) time n.


This walk process can be alternatively interpreted as a (discrete-time) pure-birth process⁴ in
the usual sense of probability theory: There is a population of individuals and, at each discrete
epoch, a new birth may take place, the probability of a birth being $\alpha_j$ when the population is of
size j.


⁴ The theory of pure-birth processes is discussed under a calculational and non measure-theoretic
angle in the book by Bharucha-Reid [62]. See also the Course by Karlin and Taylor [363] for a concrete
presentation.

Figure V.7. A simulation of 10 trajectories of the pure-birth process till n = 1024,
with geometric probabilities corresponding to q = 1/2, compared to the curve $\log_2 x$.


The form (31) readily lends itself to a partial fraction decomposition. Assume for simplicity
that the $\alpha_j$ are all distinct. The poles of $H_{0,m}$ are at the points $(1-\alpha_j)^{-1}$ and one finds as
$z \to (1-\alpha_j)^{-1}$:

$H_{0,m}(z) \sim \dfrac{r_{j,m}}{1 - z(1-\alpha_j)}, \qquad\text{where}\quad r_{j,m} := \dfrac{\alpha_0\alpha_1\cdots\alpha_{m-1}}{\prod_{k\in[0,m],\ k\ne j}(\alpha_k - \alpha_j)}.$

Thus, the probability of being in state m at time n is given by a sum:

(32)    $[z^n]H_{0,m}(z) = \sum_{j=0}^{m} r_{j,m}\,(1-\alpha_j)^{n}.$
An especially interesting case of the pure-birth walk is when the quantities $\alpha_k$ are geometric:
$\alpha_k = q^k$ for some q with 0 < q < 1. In that case, the probability of being in state m after n
transitions becomes (cf (32))

(33)    $\sum_{j=0}^{m} \dfrac{(-1)^{j}\,q^{\binom{j}{2}}}{(q)_j\,(q)_{m-j}}\,\bigl(1-q^{m-j}\bigr)^{n}, \qquad (q)_j := (1-q)(1-q^2)\cdots(1-q^j).$

This corresponds to a stochastic progression in a medium with exponentially increasing hardness
or, equivalently, to the growth of a population whose size adversely affects fertility in an
exponential manner. On intuitive grounds, we expect an evolution of the process to stay reasonably
close to the curve $y = \log_{1/q} x$; see Figure V.7 for a simulation confirming this fact, which
can be justified by means of formula (33). This particular analysis is borrowed from [218],
where it was initially developed in connection with the "approximate counting" algorithm to be
studied next. □
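Formulae (32) and (33) can be compared with a direct iteration of the walk, which is a useful safeguard against sign or index slips in the partial fraction coefficients. The sketch below does this for arbitrary distinct birth probabilities and for the geometric case $\alpha_k = q^k$; all function names are illustrative, and the direct iteration simply drops the probability mass that moves beyond the last tracked state.

```python
from math import prod

def birth_distribution(alpha, n):
    """P(position = m) after n steps, for m = 0..len(alpha) - 1, by iteration.

    alpha[j] is the probability of moving from j to j + 1. Mass leaving the
    last tracked state is discarded; this does not affect the tracked states.
    """
    dist = [1.0] + [0.0] * (len(alpha) - 1)
    for _ in range(n):
        nxt = [0.0] * len(alpha)
        for j, pj in enumerate(dist):
            nxt[j] += pj * (1 - alpha[j])
            if j + 1 < len(alpha):
                nxt[j + 1] += pj * alpha[j]
        dist = nxt
    return dist

def partial_fraction_probability(alpha, m, n):
    """Formula (32): sum over j of r_{j,m} (1 - alpha_j)^n, distinct alpha_j."""
    numerator = prod(alpha[:m])
    total = 0.0
    for j in range(m + 1):
        denominator = prod(alpha[k] - alpha[j] for k in range(m + 1) if k != j)
        total += numerator / denominator * (1 - alpha[j]) ** n
    return total

def q_factorial(q, j):
    """(q)_j = (1 - q)(1 - q^2) ... (1 - q^j)."""
    return prod(1 - q**i for i in range(1, j + 1))

def geometric_probability(q, m, n):
    """Formula (33), the case alpha_k = q^k."""
    return sum(
        (-1) ** j * q ** (j * (j - 1) // 2)
        / (q_factorial(q, j) * q_factorial(q, m - j))
        * (1 - q ** (m - j)) ** n
        for j in range(m + 1)
    )

if __name__ == "__main__":
    q, n = 0.5, 20
    alpha = [q**k for k in range(8)]
    dist = birth_distribution(alpha, n)
    for m in range(8):
        print(m, dist[m], partial_fraction_probability(alpha, m, n),
              geometric_probability(q, m, n))
```

For q = 1/2 and n = 1024 the mean position returned by the direct iteration is close to $\log_2 1024 = 10$, consistent with the behaviour shown in Figure V.7.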


Example V.6. Approximate Counting. Assume you need to keep a counter that is able to
record the number of certain events (say impulses) and should have the capability of keeping
counts till a certain maximal value N. A standard information-theoretic argument (with $\ell$ bits,
one can only keep track of $2^{\ell}$ possibilities) implies that one needs $\lceil \log_2(N+1)\rceil$ bits to perform
the task; a standard binary counter will indeed do the job. However, in 1977, Robert Morris
proposed a way to maintain counters that only requires of the order of log log N bits. What's
the catch?

Morris' elegant idea consists in relaxing the constraint of exactness in the counting process
and, by playing with probabilities, tolerating a small error on the counts obtained. Precisely, his
solution maintains a random quantity Q which is initialized by Q = 0. Upon receiving an
impulse, one updates Q according to the following simple procedure (with q ∈ (0, 1) a design
parameter):

    procedure Update(Q);
        with probability $q^{Q}$ do Q := Q + 1 (else keep Q unchanged).

When asked the number of impulses (number of times the update procedure was called) at any
moment, simply use the following procedure to return an estimate:

    procedure Answer(Q);
        output $X = \dfrac{q^{-Q} - 1}{1 - q}$.

Let $Q_n$ be the value of the random quantity Q after n executions of the update procedure
and $X_n$ the corresponding estimate output by the algorithm. It is easy to verify (by recurrence
or by generating functions; see Note V.12 below for higher moments) that, for n ≥ 1,

(34)    $\mathbb{E}(q^{-Q_n}) = n(1-q) + 1, \qquad\text{so that}\qquad \mathbb{E}(X_n) = n.$


Thus the answer provided at any instant is an unbiased estimator (in a mean value sense) of
the actual count n. On the other hand, the analysis of the geometric pure-birth process in
the previous example applies. In particular, the exponential approximation $(1-\alpha)^n \approx e^{-n\alpha}$
in conjunction with the basic formula (33) shows that for large n and m sufficiently near to
$\log_{1/q} n$, one has (asymptotically) the geometric-birth distribution

(35)    $\mathbb{P}(Q_n = m) = \sum_{j=0}^{\infty} \dfrac{(-1)^{j}\,q^{\binom{j}{2}}}{(q)_j\,(q)_\infty}\,\exp\!\left(-q^{x-j}\right) + o(1), \qquad x \equiv m - \log_{1/q} n.$

(We refer to [218] for details.) Such calculations imply that $Q_n$ is with high probability (w.h.p.)
close to $\log_{1/q} n$. Thus, if n ≤ N, the value of $Q_n$ will be w.h.p. bounded from above by
$(1+\epsilon)\log_{1/q} N$, with ε a small constant. But this means that the integer Q, which can itself
be represented in binary, will only require

(36)    $\log_2 \log n + O(1)$

bits for storage, for fixed q.

A closer examination of the formulae reveals that the accuracy of the estimate improves
considerably when q becomes close to 1. The standard error is defined as $\frac{1}{n}\sqrt{\mathbb{V}(X_n)}$ and it
measures, in a mean-quadratic sense, the relative error likely to be made. The variance of $Q_n$
is, as for the mean, determined by recurrence or generating functions, and one finds

(37)    $\mathbb{V}(q^{-Q_n}) = \dbinom{n}{2}(1-q)^{3}, \qquad \dfrac{1}{n}\sqrt{\mathbb{V}(X_n)} \sim \sqrt{\dfrac{1-q}{2}}$

(see also Note V.12 below). This means that accuracy increases as q approaches 1 and, by
suitably dimensioning q, one can make it asymptotically as small as desired. In summary,
(34), (37), and (36) express the following property: Approximate counting makes it possible to
count till N using only about log log N bits of storage, while achieving a standard error that is
asymptotically a constant and can be set to any prescribed small value. Morris' trick is now
fully understood.
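The moments of $q^{-Q_n}$ can be obtained exactly, for moderate n, by propagating the distribution of Q through n updates. The sketch below does this and is meant as an experiment rather than a reference implementation. It makes one assumption that should be stated loudly: the increment probability is taken to be $q^{Q+\text{offset}}$, where offset is a hypothetical parameter introduced here. With offset = 1, the recurrence reproduces $\mathbb{E}(q^{-Q_n}) = n(1-q) + 1$ exactly; with offset = 0, which is the rule "with probability $q^Q$" of procedure Update, the same recurrence gives $n(q^{-1}-1) + 1$, which agrees with the former to first order as q tends to 1. In both cases the relative standard error of the (suitably normalized) estimate comes out close to $\sqrt{(1-q)/2}$ when q is near 1.

```python
from math import sqrt

def counter_law(q, n, offset):
    """Exact distribution of Q after n updates, starting from Q = 0.

    Assumption of this sketch: an update increments Q with probability
    q**(Q + offset). offset=0 is the rule "with probability q^Q";
    offset=1 is a shifted variant used here for comparison.
    """
    law = [1.0]  # law[v] = P(Q == v)
    for _ in range(n):
        nxt = [0.0] * (len(law) + 1)
        for value, prob in enumerate(law):
            up = q ** (value + offset)
            nxt[value + 1] += prob * up
            nxt[value] += prob * (1.0 - up)
        law = nxt
    return law

def moments_of_inverse_power(q, n, offset):
    """Mean and variance of q**(-Q_n); zero-probability values are skipped."""
    law = counter_law(q, n, offset)
    mean = sum(p * q ** (-v) for v, p in enumerate(law) if p > 0.0)
    second = sum(p * q ** (-2 * v) for v, p in enumerate(law) if p > 0.0)
    return mean, second - mean * mean

def relative_standard_error(q, n, offset):
    """sqrt(V(X_n)) / E(X_n) for any estimate X = (q**(-Q) - 1) / constant."""
    mean, variance = moments_of_inverse_power(q, n, offset)
    return sqrt(variance) / (mean - 1.0)

if __name__ == "__main__":
    q = 2 ** (-1 / 16)
    for offset in (0, 1):
        print(offset, moments_of_inverse_power(q, 400, offset)[0],
              relative_standard_error(q, 400, offset), sqrt((1 - q) / 2))
```

Because the relative error is computed as a ratio, it does not depend on the constant used to normalize the estimate, so the comparison with $\sqrt{(1-q)/2}$ is meaningful for both update rules.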

For instance, with $q = 2^{-1/16}$, it proves possible to count up to $2^{16} = 65536$ using only
8 bits (instead of 16), with an error likely not to exceed 20%. Naturally, there's not too much
reason to appeal to the algorithm when a single counter needs to be managed (everybody can
afford a few bits!): Approximate Counting turns out to be useful when a very large number of
counts need to be kept simultaneously. It constitutes one of the early examples of a probabilistic
algorithm in the extraction of information from large volumes of data, an area also known as
data mining; see [224] for a review of connections with analytic combinatorics and references.
Functions akin to those of (35) also surface in other areas of probability theory. Guillemin,
Robert, and Zwart [314] have detected them in processes that combine an additive increase and
a multiplicative decrease (AIMD processes), in a context motivated by the adaptive transmission
of "windows" of varying sizes in large communication networks (the TCP protocol of the
internet). Biane, Bertoin, and Yor [58] encountered a function identical to (35) in their study of
exponential functionals of Poisson processes. □


▷ V.12. Moments of $q^{-Q_n}$. It is a perhaps surprising fact that any integral moment of $q^{-Q_n}$ is
a polynomial in n, q, and $q^{-1}$, as in (34), (37). To see it, define

$\Phi(w) \equiv \Phi(w; \xi, q) := \sum_{m\ge 0} q^{m(m+1)/2}\,\dfrac{\xi^{m} w^{m}}{(1+\xi q)(1+\xi q^{2})\cdots(1+\xi q^{m+1})}.$

By (31), one has

$\sum_{m\ge 0} H_{0,m}(z)\,w^{m} = \dfrac{1}{1-z}\,\Phi\!\left(w; \dfrac{z}{1-z}, q\right).$

On the other hand, Φ satisfies $\Phi(w) = 1 - q\xi(1-w)\,\Phi(qw)$, hence the q-identity,

$\Phi(w) = \sum_{j\ge 0} (-q\xi)^{j}\,\bigl[(1-w)(1-qw)\cdots(1-q^{j-1}w)\bigr],$

which belongs to the area of q-calculus⁵. Thus $\Phi(q^{-r}; \xi, q)$ is a polynomial for any $r \in \mathbb{Z}_{\ge 0}$,
as the expansion terminates. See Prodinger's study [498] for connections with basic hypergeometric
functions and Heine's transformation.
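As an illustration of how the expansion terminates, the case r = 1 can be worked out by hand. The computation below takes the identities of this note at face value, namely the terminating expansion $\Phi(w) = \sum_{j\ge 0}(-q\xi)^j (1-w)(1-qw)\cdots(1-q^{j-1}w)$ and the relation $\sum_m H_{0,m}(z)w^m = \frac{1}{1-z}\Phi(w; \frac{z}{1-z}, q)$, and only uses the substitution $\xi = z/(1-z)$.

```latex
\begin{align*}
\Phi(q^{-1};\xi,q) &= 1 - q\xi\,(1 - q^{-1}) \;=\; 1 + (1-q)\,\xi
   && \text{(all terms with } j \ge 2 \text{ contain the factor } 1 - q\cdot q^{-1} = 0\text{)},\\[4pt]
\sum_{m\ge 0} H_{0,m}(z)\,q^{-m}
   &= \frac{1}{1-z}\left(1 + (1-q)\,\frac{z}{1-z}\right)
   \;=\; \frac{1}{1-z} + (1-q)\,\frac{z}{(1-z)^{2}},\\[4pt]
[z^{n}]\sum_{m\ge 0} H_{0,m}(z)\,q^{-m} &= 1 + n(1-q).
\end{align*}
```

Since $[z^n]H_{0,m}(z)$ is the probability of being in state m at time n, the last line is a first moment of the form $\mathbb{E}(q^{-Q_n})$, and it has the shape $n(1-q) + 1$ displayed in (34). Higher values of r give polynomials of higher degree in n by the same mechanism.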


Hidden patterns: regular expression modelling and moments. We return here 
to the analysis of the number of occurrences of a pattern p as a subsequence in a ran- 
dom text. The mean number of occurrences can be obtained by enumerating contexts 
of occurrences: in a sense we are then enumerating the language of all words by means 
of a dedicated regular expression where the ambiguity coefficient (the multiplicity) of 
a word is precisely equal to the number of occurrences of the pattern. This technique, 
which gives an easy access to expectations, also works for higher moments. It supple- 
ments the fact that there is no easy way to get a BGF in such cases, and it appears to 
be sufficient to derive a concentration of distribution property. 


Example V.7. Occurrences of "hidden" patterns in Bernoulli texts. Fix an alphabet $\mathcal{A} =
\{a_1, \ldots, a_r\}$ of cardinality r and assume a probability distribution on $\mathcal{A}$ to be given, with $p_j$
the probability of letter $a_j$. We consider the Bernoulli model on $\mathcal{W} = \mathrm{SEQ}(\mathcal{A})$, where the
probability of a word is the product of the probabilities of its letters (cf Subsection III.6.1,
p. 189). A word $\mathfrak{p} = y_1 \cdots y_k$ called the pattern is fixed. The problem is to gather information
on the random variable X representing the number of occurrences of $\mathfrak{p}$ in the set $\mathcal{W}_n$, where
occurrences as a "hidden pattern", i.e., as a subsequence, are counted (see Example I.11, p. 54,
for the case of equiprobable letters).


⁵ By q-calculus is roughly meant the collection of special function identities relating power series of
the form $\sum a_n(q) z^n$, where $a_n(q)$ is a rational fraction whose degree is quadratic in n. See [15, Ch. 10] for
basics and [284] for more advanced (q-hypergeometric) material.

Mean value analysis. The generating function associated to $\mathcal{W}$ endowed with its probabilistic
weighting is

$W(z) = \dfrac{1}{1 - \sum_j p_j z} = \dfrac{1}{1-z}.$

The regular specification

(38)    $\mathcal{O} = \mathrm{SEQ}(\mathcal{A})\,y_1\,\mathrm{SEQ}(\mathcal{A}) \cdots \mathrm{SEQ}(\mathcal{A})\,y_{k-1}\,\mathrm{SEQ}(\mathcal{A})\,y_k\,\mathrm{SEQ}(\mathcal{A})$

describes all contexts of occurrences of $\mathfrak{p}$ as a subsequence in all words. Graphically, this may
be rendered as follows, for a pattern of length 3 such as $\mathfrak{p} = y_1 y_2 y_3$:

(39)    ———[1]———[2]———[3]———

There the boxes indicate distinguished positions where letters of the pattern appear and the
horizontal lines represent arbitrary separating words ($\mathrm{SEQ}(\mathcal{A})$). The corresponding OGF

(40)    $O(z) = \dfrac{\pi(\mathfrak{p})\,z^{k}}{(1-z)^{k+1}}, \qquad \pi(\mathfrak{p}) := p_{y_1} \cdots p_{y_{k-1}}\,p_{y_k},$

counts elements of $\mathcal{W}$ with multiplicity⁶, where the multiplicity coefficient λ(w) of a word $w \in
\mathcal{W}$ is precisely equal to the number of occurrences of $\mathfrak{p}$ as a subsequence in w:

$O(z) = \sum_{w\in\mathcal{A}^\star} \lambda(w)\,\pi(w)\,z^{|w|}.$

This shows that the mean value of the number X of hidden occurrences of $\mathfrak{p}$ in a random word
of length n satisfies

(41)    $\mathbb{E}_{\mathcal{W}_n}(X) = [z^{n}]\,O(z) = \pi(\mathfrak{p})\dbinom{n}{k},$

which is consistent with what a direct probabilistic reasoning would give.


Variance analysis. In order to determine the variance of X over $\mathcal{W}_n$, we need contexts in
which pairs of occurrences appear. Let $\mathcal{Q}$ denote the set of all words in $\mathcal{W}$ with two occurrences
(i.e., an ordered pair of occurrences) of $\mathfrak{p}$ as a subsequence being distinguished. Then clearly
$[z^n]Q(z)$ represents $\mathbb{E}_{\mathcal{W}_n}(X^2)$. There are several cases to be considered. Graphically, a pair of
occurrences may share no common position, like in what follows:

(42)    [diagram: two copies 1, 2, 3 of the pattern, interleaved, with no shared position]

But they may also have one or several overlapping positions, like in

(43)    [diagram: two copies 1, 2, 3 of the pattern sharing exactly one position]

(44)    [diagram: two copies 1, 2, 3 of the pattern sharing two positions]

(This last situation necessitates $y_2 = y_3$, typical patterns being abb and aaa.)


⁶ In language-theoretic terms, we make use of the regular expression $\mathcal{O} = \mathcal{A}^\star y_1 \mathcal{A}^\star \cdots y_{k-1} \mathcal{A}^\star y_k \mathcal{A}^\star$
that describes a subset of $\mathcal{A}^\star$ in an ambiguous manner and takes into account the ambiguity coefficients.

In the first case corresponding to (42), where there are no overlapping positions, the
configurations of interest have OGF

(45)    $Q^{[0]}(z) = \dbinom{2k}{k}\dfrac{\pi(\mathfrak{p})^{2}\,z^{2k}}{(1-z)^{2k+1}}.$

There, the binomial coefficient $\binom{2k}{k}$ counts the total number of ways of freely interleaving two
copies of $\mathfrak{p}$; the quantity $\pi(\mathfrak{p})^2 z^{2k}$ takes into account the 2k distinct positions where the letters
of the two copies appear; the factor $(1-z)^{-2k-1}$ corresponds to all the possible 2k + 1 fillings
of the gaps between letters.

In the second case, let us start by considering pairs where exactly one position is overlapping,
like in (43). Say this position corresponds to the rth and sth letters of $\mathfrak{p}$ (r and s may be
unequal). Obviously, we need $y_r = y_s$ for this to be possible. The OGF of the configurations is
now

$\dbinom{r+s-2}{r-1}\dbinom{2k-r-s}{k-r}\,\dfrac{\pi(\mathfrak{p})^{2}\,(p_{y_r})^{-1}\,z^{2k-1}}{(1-z)^{2k}}.$

There, the first binomial coefficient $\binom{r+s-2}{r-1}$ counts the total number of ways of interleaving
$y_1 \cdots y_{r-1}$ and $y_1 \cdots y_{s-1}$; the second binomial $\binom{2k-r-s}{k-r}$ is similarly associated to the
interleavings of $y_{r+1} \cdots y_k$ and $y_{s+1} \cdots y_k$; the numerator takes into account the fact that 2k − 1
positions are now occupied by predetermined letters; finally the factor $(1-z)^{-2k}$ corresponds
to all the 2k fillings of the gaps between letters. Summing over all possibilities for r, s gives the
OGF of pairs with one overlapping position as

(46)    $Q^{[1]}(z) = \left(\sum_{1\le r,s\le k} \dbinom{r+s-2}{r-1}\dbinom{2k-r-s}{k-r}\,\dfrac{[\![y_r = y_s]\!]}{p_{y_r}}\right)\dfrac{\pi(\mathfrak{p})^{2}\,z^{2k-1}}{(1-z)^{2k}}.$

Similar arguments show that the OGF of pairs of occurrences with at least two shared
positions (see, e.g., (44)) is of the form, with P a polynomial,

(47)    $Q^{[\ge 2]}(z) = \dfrac{P(z)}{(1-z)^{2k-1}},$

for the essential reason that, in the finitely many remaining situations, there are at most (2k − 1)
possible gaps.

We can now examine (45), (46), (47) in the light of singularities. The coefficient $[z^n]Q^{[0]}(z)$
is seen to cancel to first asymptotic order with the square of the mean as given in (41). The
contribution of the coefficient $[z^n]Q^{[\ge 2]}(z)$ appears to be negligible as it is $O(n^{2k-2})$. The
coefficient $[z^n]Q^{[1]}(z)$, which is $O(n^{2k-1})$, is seen to contribute to the asymptotic growth of
the variance. In summary, after a trite calculation, we obtain:


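Proposition V.2. The number X of occurrences of a hidden pattern $\mathfrak{p}$ in a random text of size n
obeying a Bernoulli model satisfies

$\mathbb{E}_{\mathcal{W}_n}(X) = \pi(\mathfrak{p})\dbinom{n}{k} \sim \dfrac{\pi(\mathfrak{p})}{k!}\,n^{k}, \qquad \mathbb{V}_{\mathcal{W}_n}(X) = \dfrac{\pi(\mathfrak{p})^{2}\,\kappa(\mathfrak{p})^{2}}{(2k-1)!}\,n^{2k-1}\left(1 + O\!\left(\dfrac{1}{n}\right)\right),$

where the "correlation coefficient" $\kappa(\mathfrak{p})^2$ is given by

$\kappa(\mathfrak{p})^{2} = \sum_{1\le r,s\le k} \dbinom{r+s-2}{r-1}\dbinom{2k-r-s}{k-r}\left(\dfrac{[\![y_r = y_s]\!]}{p_{y_r}} - 1\right).$

In particular, the distribution of X is concentrated around its mean.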
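For short texts, the mean and variance of X can be computed exhaustively and set against the closed forms, which makes the role of the correlation coefficient tangible. The sketch below is an illustration only: the function names are hypothetical, the enumeration is exponential in n and therefore limited to small cases, and $\kappa(\mathfrak{p})^2$ is computed as the sum over $1 \le r, s \le k$ of $\binom{r+s-2}{r-1}\binom{2k-r-s}{k-r}\bigl([\![y_r = y_s]\!]/p_{y_r} - 1\bigr)$.

```python
from itertools import product
from math import comb, factorial, prod

def occurrences(text, pattern):
    """Number of occurrences of pattern as a subsequence of text."""
    ways = [1] + [0] * len(pattern)  # ways[i]: embeddings of pattern[:i] so far
    for letter in text:
        # Go downwards so that each text letter is used at most once per embedding.
        for i in range(len(pattern), 0, -1):
            if pattern[i - 1] == letter:
                ways[i] += ways[i - 1]
    return ways[len(pattern)]

def exact_moments(pattern, probs, n):
    """Mean and variance of the occurrence count over all words of length n."""
    mean = second = 0.0
    for word in product(probs, repeat=n):
        weight = prod(probs[c] for c in word)
        x = occurrences(word, pattern)
        mean += weight * x
        second += weight * x * x
    return mean, second - mean * mean

def kappa_squared(pattern, probs):
    """Correlation coefficient kappa(p)^2 in the form stated in the lead-in."""
    k = len(pattern)
    total = 0.0
    for r in range(1, k + 1):
        for s in range(1, k + 1):
            same = pattern[r - 1] == pattern[s - 1]
            term = (1.0 / probs[pattern[r - 1]] if same else 0.0) - 1.0
            total += comb(r + s - 2, r - 1) * comb(2 * k - r - s, k - r) * term
    return total

def leading_variance(pattern, probs, n):
    """Main term pi(p)^2 kappa(p)^2 n^(2k-1) / (2k-1)! of the variance."""
    k = len(pattern)
    pi_p = prod(probs[c] for c in pattern)
    return pi_p**2 * kappa_squared(pattern, probs) / factorial(2 * k - 1) * n ** (2 * k - 1)

if __name__ == "__main__":
    probs = {"a": 0.5, "b": 0.5}
    mean, variance = exact_moments("ab", probs, 10)
    print(mean, 0.25 * comb(10, 2))
    print(variance, leading_variance("ab", probs, 10))
```

For a pattern of length 1 the count X is binomial and the main term of the variance is exact; for longer patterns the ratio of the exact variance to the main term approaches 1 at rate O(1/n).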




This example is based on an article by Flajolet, Szpankowski, and Vallée [263]. There the 
authors show further that the asymptotic behaviour of moments of higher order can be worked 
out. By the Moment Convergence Theorem (Theorem C.2, p. 778), this calculation entails that 
the distribution of X over $\mathcal{W}_n$ is asymptotically normal. The method also extends to a much
more general notion of “hidden” pattern; e.g., distances between letters of p can be constrained 
in various ways so as to determine a valid occurrence in the text [263]. It also extends to the very 
general framework of dynamical sources [81], which include Markov models as a special case. 
The two references [81, 263] thus provide a set of analyses that interpolate between the two 
extreme notions of pattern occurrence—as a block of consecutive symbols or as a subsequence 
(“hidden pattern”). Such studies demonstrate that hidden patterns are with high probability 
bound to occur an extremely large number of times in a long enough text—this might cast some 
doubts on numerological interpretations encountered in various cultures: see in particular the 
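critical discussion of the "Bible Codes" by McKay et al. in [433]. □


▷ V.13. Hidden patterns and shuffle relations. To each pair u, v of words over $\mathcal{A}$ associate
the weighted-shuffle polynomial in the indeterminates $\mathcal{A}$ denoted by $\binom{u}{v}_t$ and defined by the
properties

$\dbinom{xu}{yv}_t = x\dbinom{u}{yv}_t + y\dbinom{xu}{v}_t + t\,[\![x = y]\!]\,x\dbinom{u}{v}_t, \qquad \dbinom{\mathbf{1}}{u}_t = \dbinom{u}{\mathbf{1}}_t = u,$

where t is a parameter, x, y are elements of $\mathcal{A}$, and $\mathbf{1}$ is the empty word. Then the OGF of Q(z)
above is

$Q(z) = \sigma\!\left[\dbinom{\mathfrak{p}}{\mathfrak{p}}_{(1-z)}\right]\dfrac{1}{(1-z)^{2k+1}},$

where σ is the substitution $a_j \mapsto p_j z$. ◁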


V.4. Nested sequences, lattice paths, and continued fractions 


This section treats the nested sequence schema, corresponding to a cascade of
sequences of the rough form SEQ ∘ SEQ ∘ ··· ∘ SEQ. Such a schema covers Dyck
and Motzkin paths, a particular type of Łukasiewicz paths already encountered in
Section I.5.3 (p. 73). Equipped with probabilistic weights, these paths appear as
trajectories of birth-and-death processes (the case of pure-birth processes has already been
dealt with in Example V.5, p. 312). They also have great descriptive power since, 
once endowed with integer weights, they can encode a large variety of combinatorial 
classes, including trees, permutations, set partitions, and surjections. 

Since a combinatorial sequence translates into a quasi-inverse, $Q(f) = (1-f)^{-1}$,
a class described by nested sequences has its generating function expressed by
a cascade of fractions, that is, a continued fraction⁷. Analytically, these GFs have
two dominant poles (the Dyck case) or a single pole (the Motzkin case) on their disc 
of convergence, so that the implementation of the process underlying Theorem V.3 
is easy: we encounter a pure polynomial form of the simplest type that describes all 
counting sequences of interest. The profile of a nested sequence can also be easily 
characterized. 


⁷ Characteristically, the German term for "continued fraction" is "Kettenbruch", literally
"chain-fraction".

This section starts with a statement of the "Continued Fraction Theorem" (Proposition V.3,
p. 321) taken from an old study of Flajolet [214], which provides the general
set-up for the rest of the section. It then proceeds with the general analytic treatment of 
nested sequences. A number of examples from various areas of discrete mathematics 
are then detailed, including the important analysis of height in Dyck paths and gen- 
eral Catalan trees. Some of these examples make use of structures that are described 
as infinitely nested sequences, that is, infinite continued fractions, to which the finite 
theory often extends—the analysis of coin fountains below is typical. 


V.4.1. Combinatorial aspects. We discuss here a special type of lattice paths
connecting points of the discrete cartesian plane $\mathbb{Z} \times \mathbb{Z}$.


Definition V.4 (Lattice path). A Motzkin path $\upsilon = (U_0, U_1, \ldots, U_n)$ is a sequence
of points in the discrete quarter-plane $\mathbb{Z}_{\ge 0} \times \mathbb{Z}_{\ge 0}$, such that $U_j = (j, y_j)$ and the
jump condition $|y_{j+1} - y_j| \le 1$ is satisfied. An edge $(U_j, U_{j+1})$ is called an ascent if
$y_{j+1} - y_j = +1$, a descent if $y_{j+1} - y_j = -1$, and a level step if $y_{j+1} - y_j = 0$. A
path that has no level steps is called a Dyck path.

The quantity n is the length of the path, $\mathrm{ini}(\upsilon) := y_0$ is the initial altitude,
$\mathrm{fin}(\upsilon) := y_n$ is the final altitude. A path is called an excursion if both its
initial and final altitudes are zero. The extremal quantities $\sup\{\upsilon\} := \max_j y_j$ and
$\inf\{\upsilon\} := \min_j y_j$ are called the height and depth of the path.


A path can always be encoded by a word with a, b, c representing ascents, descents,
and level steps, respectively. What we call the standard encoding is such a
word in which each step a, b, c is (redundantly) subscripted by the value of the
y-coordinate of its initial point. For instance,

$w = c_0\,a_0\,a_1\,a_2\,b_3\,c_2\,c_2\,a_2\,b_3\,b_2\,b_1\,a_0\,c_1$

encodes a path that connects the initial point (0, 0) to the point (13, 1). Such a path
can also be regarded as the evolution in discrete time of a walk over the integer line
with jumps restricted to {−1, 0, +1}, or equivalently as a path in the graph:

(48)    [graph: vertices 0, 1, 2, ...; an edge $a_j$ from j to j + 1, an edge $b_j$ from j to j − 1, and a loop $c_j$ at j]
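The redundancy of the standard encoding makes it easy to validate a word mechanically: each subscript must equal the altitude at which the step starts. The sketch below decodes a word written as space-separated tokens such as `c0 a0 a1` (a textual convention chosen here for illustration, as is the function name) and returns the successive altitudes.

```python
STEP = {"a": +1, "b": -1, "c": 0}  # ascent, descent, level step

def decode(word):
    """Return the list of altitudes y_0, ..., y_n of a standard-encoded path.

    `word` is a non-empty string of tokens like "a2" (letter, then subscript).
    Raises ValueError if a subscript disagrees with the current altitude or
    if the path would leave the quarter-plane.
    """
    tokens = word.split()
    y = int(tokens[0][1:])  # initial altitude is the first subscript
    altitudes = [y]
    for token in tokens:
        kind, level = token[0], int(token[1:])
        if level != y:
            raise ValueError(f"{token}: subscript does not match altitude {y}")
        y += STEP[kind]
        if y < 0:
            raise ValueError(f"{token}: path goes below altitude 0")
        altitudes.append(y)
    return altitudes

if __name__ == "__main__":
    w = "c0 a0 a1 a2 b3 c2 c2 a2 b3 b2 b1 a0 c1"
    path = decode(w)
    print(len(path) - 1, path[0], path[-1], max(path))
```

Applied to the word w displayed above, the decoder reports a path of length 13 from altitude 0 to altitude 1, in agreement with the endpoints (0, 0) and (13, 1).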


Lattice paths can also be interpreted as trajectories of birth-and-death processes, where
a population can evolve at any discrete time by a birth or a death. (Compare with the
pure-birth case in (30), p. 312.)

As a preparation for later developments, let us examine the description of the
class written $\mathcal{H}^{[<1]}_{0,0}$ of Motzkin excursions of height < 1. We have

$\mathcal{H}^{[<1]}_{0,0} \cong \mathrm{SEQ}(c_0) \quad\Longrightarrow\quad H^{[<1]}_{0,0} = \dfrac{1}{1-c_0}.$

The class of excursions of height < 2 is obtained from here by a substitution

$c_0 \mapsto c_0 + a_0\,\mathrm{SEQ}(c_1)\,b_1,$

to the effect that

$\mathcal{H}^{[<2]}_{0,0} \cong \mathrm{SEQ}\bigl(c_0 + a_0\,\mathrm{SEQ}(c_1)\,b_1\bigr) \quad\Longrightarrow\quad H^{[<2]}_{0,0} = \cfrac{1}{1 - c_0 - \cfrac{a_0 b_1}{1 - c_1}} = \dfrac{1-c_1}{1 - c_0 - c_1 + c_0 c_1 - a_0 b_1}.$

Iteration of this simple mechanism lies at the heart of the calculations performed below.
Clearly, generating functions written in this way are nothing but a concise description
of usual counting generating functions: for instance if individual weights⁸
$\alpha_j, \beta_j, \gamma_j$ are assigned to the letters $a_j, b_j, c_j$, respectively, then the OGF of
multiplicatively weighted paths with z marking length is obtained by setting

(49)    $a_j = \alpha_j z, \qquad b_j = \beta_j z, \qquad c_j = \gamma_j z.$

The general class of paths of interest in this subsection is defined by arbitrary
combinations of flooring (by m), ceiling (by h), as well as fixing initial (k) and final
(l) altitudes. Accordingly, we define the following subclasses of the class $\mathcal{H}$ of all
Motzkin paths:

$\mathcal{H}^{[\ge m,<h]}_{k,l} := \bigl\{w \in \mathcal{H} : \mathrm{ini}(w) = k,\ \mathrm{fin}(w) = l,\ m \le \inf\{w\},\ \sup\{w\} < h\bigr\}.$

We shall also need the special cases:

$\mathcal{H}^{[<h]}_{k,l} = \mathcal{H}^{[\ge 0,<h]}_{k,l}, \qquad \mathcal{H}^{[\ge m]}_{k,l} = \mathcal{H}^{[\ge m,<\infty]}_{k,l}, \qquad \mathcal{H}_{k,l} = \mathcal{H}^{[\ge 0,<\infty]}_{k,l}.$

(Thus, the superscript indicates the condition that is to be satisfied by the ordinates of all
vertices of the path.) Three simple combinatorial decompositions of paths (Figure V.8)
then suffice to derive all the basic formulae.


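(i) Arch decomposition: An excursion from and to level 0 consists of a sequence
of "arches", each made of either a $c_0$ or an $a_0\,\mathcal{H}^{[\ge 1]}_{1,1}\,b_1$, so that

(50)    $\mathcal{H}_{0,0} = \mathrm{SEQ}\bigl(c_0 \cup a_0\,\mathcal{H}^{[\ge 1]}_{1,1}\,b_1\bigr),$

which relativizes to height < h.

(ii) Last passages decomposition. Recording the times at which each level 0, ..., k
is last traversed gives

(51)    $\mathcal{H}_{0,k} = \mathcal{H}^{[\ge 0]}_{0,0}\,a_0\,\mathcal{H}^{[\ge 1]}_{1,1}\,a_1 \cdots a_{k-1}\,\mathcal{H}^{[\ge k]}_{k,k}.$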

⁸ Throughout this chapter, all weights are assumed to be non-negative.

Figure V.8. The three major decompositions of lattice paths: the arch decomposition
(top), the last passages decomposition (bottom left), and the first passage decomposition
(bottom right).


(iii) First passage decomposition. The quantities $\mathcal{H}_{k,l}$ with k ≤ l are implicitly
determined by the first passage through k in a path connecting level 0 to l, so that

(52)    $\mathcal{H}_{0,l} = \mathcal{H}^{[<k]}_{0,k-1}\,a_{k-1}\,\mathcal{H}_{k,l} \qquad (k \le l).$

(A dual decomposition holds when k ≥ l.)


The basic results of the theory express the generating functions in terms of a fundamental
continued fraction and its associated convergent polynomials. They involve
the "numerator" and "denominator" polynomials, denoted by $P_h$ and $Q_h$, that are defined
as solutions to the second-order (or "three-term") linear recurrence equation

(53)    $Y_{h+1} = (1 - c_h)\,Y_h - a_{h-1} b_h\,Y_{h-1}, \qquad h \ge 0,$

together with the initial conditions $(P_{-1}, Q_{-1}) = (-1, 0)$, $(P_0, Q_0) = (0, 1)$, and
with the convention $a_{-1} b_0 = 1$. In other words, setting $C_j = 1 - c_j$ and $A_j = a_{j-1} b_j$,
we have:

(54)
    $P_0 = 0, \quad P_1 = 1, \quad P_2 = C_1, \quad P_3 = C_1 C_2 - A_2,$
    $Q_0 = 1, \quad Q_1 = C_0, \quad Q_2 = C_0 C_1 - A_1, \quad Q_3 = C_0 C_1 C_2 - C_2 A_1 - C_0 A_2.$


These polynomials are also known as continuant polynomials [379, 601]. 


▷ V.14. Combinatorics of continuant polynomials. The polynomial $Q_h$ is obtained by the
following process: start with the product $\Pi := C_0 C_1 \cdots C_{h-1}$; then cross out in all possible ways
pairs of adjacent elements $C_{j-1} C_j$, replacing each such crossed pair by $-A_j$. For instance,
$Q_4$ is obtained as

\[ C_0 C_1 C_2 C_3 + \overbrace{C_0 C_1}^{-A_1} C_2 C_3 + C_0 \overbrace{C_1 C_2}^{-A_2} C_3 + C_0 C_1 \overbrace{C_2 C_3}^{-A_3} + \overbrace{C_0 C_1}^{-A_1} \, \overbrace{C_2 C_3}^{-A_3}, \]

where each overbraced pair is crossed out and replaced by the quantity written above it.
The polynomials $P_h$ are obtained similarly after a shift of indices. (These observations are due
to Euler; see [307, §6.7].) ◁
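The crossing-out rule of Note V.14 can be checked against the recurrence (53) on numeric values. The sketch below is only an illustration: the function names and the sample values chosen for $C_j$ and $A_j$ are assumptions, not part of the text.

```python
from fractions import Fraction

def continuants(C, A, h):
    """(P_h, Q_h) from Y_{n+1} = C[n]*Y_n - A[n]*Y_{n-1}, with A[0] read as 1."""
    p_prev, p = Fraction(-1), Fraction(0)   # P_{-1}, P_0
    q_prev, q = Fraction(0), Fraction(1)    # Q_{-1}, Q_0
    for n in range(h):
        a_n = 1 if n == 0 else A[n]         # convention a_{-1} b_0 = 1
        p_prev, p = p, C[n] * p - a_n * p_prev
        q_prev, q = q, C[n] * q - a_n * q_prev
    return p, q

def crossed_out_sum(C, A):
    """Sum over all ways of crossing out disjoint adjacent pairs C[j-1]C[j] -> -A[j]."""
    def suffix(i):
        if i >= len(C):
            return Fraction(1)
        total = C[i] * suffix(i + 1)             # C[i] is kept
        if i + 1 < len(C):
            total -= A[i + 1] * suffix(i + 2)    # C[i]C[i+1] is crossed out
        return total
    return suffix(0)

# Hypothetical numeric values standing in for C_j = 1 - c_j and A_j = a_{j-1} b_j.
C = [Fraction(2), Fraction(3), Fraction(5), Fraction(7), Fraction(11)]
A = [None, Fraction(1, 2), Fraction(1, 3), Fraction(1, 5), Fraction(1, 7)]
P4, Q4 = continuants(C, A, 4)
print(P4, Q4)
```

The numerators are handled by the same routine applied to the lists shifted by one index, which is the "shift of indices" mentioned in the note.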
Proposition V.3 (Continued Fraction Theorem [214]). (i) The generating function
$H_{0,0}$ of all excursions is represented by the fundamental continued fraction:

\[ H_{0,0} = \cfrac{1}{1 - c_0 - \cfrac{a_0 b_1}{1 - c_1 - \cfrac{a_1 b_2}{1 - c_2 - \cfrac{a_2 b_3}{\ddots}}}}. \tag{55} \]

(ii) The generating function of ceiled excursions $H_{0,0}^{[<h]}$ is given by a convergent of the
fundamental continued fraction (55), with $P_h$, $Q_h$ as in Equation (53):

\[ H_{0,0}^{[<h]} = \cfrac{1}{1 - c_0 - \cfrac{a_0 b_1}{1 - c_1 - \cfrac{a_1 b_2}{\ddots - \cfrac{a_{h-2} b_{h-1}}{1 - c_{h-1}}}}} = \frac{P_h}{Q_h}. \tag{56} \]

(iii) The generating function of floored excursions is given by a truncation of the
fundamental fraction:

\[ H_{h,h}^{[\ge h]} = \cfrac{1}{1 - c_h - \cfrac{a_h b_{h+1}}{1 - c_{h+1} - \cfrac{a_{h+1} b_{h+2}}{\ddots}}} \tag{57} \]

\[ \phantom{H_{h,h}^{[\ge h]}} = \frac{1}{a_{h-1} b_h} \, \frac{Q_h H_{0,0} - P_h}{Q_{h-1} H_{0,0} - P_{h-1}}. \tag{58} \]


Proof. Repeated use of the arch decomposition (50) provides a form of $H_{0,0}^{[<h]}$ with
nested quasi-inverses $(1 - f)^{-1}$ that is the finite fraction representation (56); for
instance,

\[ \mathcal{H}_{0,0}^{[<1]} \cong \operatorname{SEQ}(c_0), \qquad \mathcal{H}_{0,0}^{[<2]} \cong \operatorname{SEQ}(c_0 + a_0 \operatorname{SEQ}(c_1) b_1), \]
\[ \mathcal{H}_{0,0}^{[<3]} \cong \operatorname{SEQ}(c_0 + a_0 \operatorname{SEQ}(c_1 + a_1 \operatorname{SEQ}(c_2) b_2) b_1). \]

The continued fraction representation for basic paths without height constraints (namely
$H_{0,0}$) is then obtained by taking the limit $h \to \infty$ in (56). Finally, the continued
fraction form (57) for floored excursions is nothing but the fundamental form (55), when
the indices are shifted. The three continued fraction expansions (55), (56), (57) are
hence established.

Finding explicit expressions for the fractions $H_{0,0}^{[<h]}$ and $H_{h,h}^{[\ge h]}$ next requires
determining the polynomials that appear in the convergents of the basic fraction (55).
By definition, the convergent polynomials $P_h$ and $Q_h$ are the numerator and denominator
of the fraction $H_{0,0}^{[<h]}$. For the computation of $H_{0,0}^{[<h]}$ and $P_h$, $Q_h$, one classically
introduces the linear fractional transformations

\[ g_j(y) = \frac{1}{1 - c_j - a_j b_{j+1} y}, \]

so that

\[ H_{0,0}^{[<h]} = g_0 \circ g_1 \circ g_2 \circ \cdots \circ g_{h-1}(0) \qquad \text{and} \qquad H_{0,0} = g_0 \circ g_1 \circ g_2 \circ \cdots. \tag{59} \]

Now, linear fractional transformations are representable by $2 \times 2$ matrices

\[ \frac{ay + b}{cy + d} \;\mapsto\; \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \tag{60} \]

in such a way that the composition corresponds to matrix product. By induction on
the compositions that build up $H_{0,0}^{[<h]}$, there follows the equality

\[ g_0 \circ g_1 \circ g_2 \circ \cdots \circ g_{h-1}(y) = \frac{P_h - P_{h-1} a_{h-1} b_h y}{Q_h - Q_{h-1} a_{h-1} b_h y}, \tag{61} \]

where $P_h$ and $Q_h$ are seen to satisfy the recurrence (53). Setting $y = 0$ in (61)
proves (56).

Finally, $H_{h,h}^{[\ge h]}$ is determined implicitly as the root $y$ of the equation
$g_0 \circ \cdots \circ g_{h-1}(y) = H_{0,0}$, an equation that, when solved using (61), yields the form (58).
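The three descriptions used in the proof (the recurrence (53), the nested fraction (56), and the matrix product attached to the $g_j$) can be compared on numeric weights. This is a minimal sketch under assumed sample values; the function names and the weights are illustrative only.

```python
from fractions import Fraction

def convergent_polynomials(a, b, c, h):
    """Values of (P_h, Q_h) given by recurrence (53), for numeric a_j, b_j, c_j."""
    p_prev, p = Fraction(-1), Fraction(0)
    q_prev, q = Fraction(0), Fraction(1)
    for n in range(h):
        A_n = 1 if n == 0 else a[n - 1] * b[n]   # convention a_{-1} b_0 = 1
        p_prev, p = p, (1 - c[n]) * p - A_n * p_prev
        q_prev, q = q, (1 - c[n]) * q - A_n * q_prev
    return p, q

def nested_fraction(a, b, c, h):
    """g_0 o g_1 o ... o g_{h-1}(0), evaluated from the innermost stage outwards."""
    y = Fraction(0)
    for j in reversed(range(h)):
        y = 1 / (1 - c[j] - a[j] * b[j + 1] * y)
    return y

def matrix_product(a, b, c, h):
    """Product of the 2x2 matrices of g_0, ..., g_{h-1} in the sense of (60)."""
    m = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
    for j in range(h):
        g = [[Fraction(0), Fraction(1)], [-a[j] * b[j + 1], 1 - c[j]]]
        m = [[sum(m[r][k] * g[k][s] for k in range(2)) for s in range(2)]
             for r in range(2)]
    return m

# Hypothetical small weights; b[0] is never used since no descent leaves level 0.
a = [Fraction(1, j + 2) for j in range(8)]
b = [Fraction(1, j + 3) for j in range(8)]
c = [Fraction(1, j + 4) for j in range(8)]
print(nested_fraction(a, b, c, 4), convergent_polynomials(a, b, c, 4))
```

The entries of the product matrix are exactly the four coefficients appearing in (61), which is what the induction in the proof asserts.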


A large number of generating functions can be derived by similar techniques. We 
refer to the article [214], where this theory was first systematically developed and to 
the synthesis given in [303, Chapter 5]. Our presentation also draws upon [238] where 
the theory was put to use in order to develop a formal algebraic theory of general birth- 
and-death processes in continuous time. 


▷ V.15. Transitions and crossings. The lattice paths $\mathcal{H}_{0,l}$ corresponding to the transitions from
altitude 0 to $l$ and $\mathcal{H}_{k,0}$ (from $k$ to 0) have OGFs

\[ H_{0,l} = \frac{1}{\mathcal{B}_l} \left( Q_l H_{0,0} - P_l \right), \qquad H_{k,0} = \frac{1}{\mathcal{A}_k} \left( Q_k H_{0,0} - P_k \right). \]

The crossings $\mathcal{H}_{0,h-1}^{[<h]}$ and $\mathcal{H}_{h-1,0}^{[<h]}$ have OGFs,

\[ H_{0,h-1}^{[<h]} = \frac{\mathcal{A}_{h-1}}{Q_h}, \qquad H_{h-1,0}^{[<h]} = \frac{\mathcal{B}_{h-1}}{Q_h}. \]

(Abbreviations used here are: $\mathcal{A}_m = a_0 \cdots a_{m-1}$, $\mathcal{B}_m = b_1 \cdots b_m$.) These extensions
provide combinatorial interpretations for fractions of the form $1/Q_h$. They result from the basic
decompositions combined with Proposition V.3; see [214, 238] for details. ◁


▷ V.16. Denominator polynomials and orthogonality. Let $H_n = [z^n] H_{0,0}(z)$ represent the
number of all excursions of length $n$ equipped with non-negative weights. Define a linear
functional $\mathcal{L}$ on the space $\mathbb{C}[z]$ of polynomials by $\mathcal{L}[z^n] = H_n$. Introduce the reciprocal
polynomials: $\overline{Q}_h(z) = z^h Q_h(1/z)$. The fact, deducible from Note V.15, that $Q_l H_{0,0} - P_l = O(z^{2l})$
corresponds to the property $\mathcal{L}[z^j \overline{Q}_l] = 0$ for all $0 \le j < l$. In other words, the polynomials
$\overline{Q}_l$ are orthogonal with respect to the special scalar product $\langle f, g \rangle := \mathcal{L}[fg]$. (Historically, the
theory of orthogonal polynomials evolved from the theory of continued fractions, before living
a life of its own; see [118, 343, 563] for its many facets.) ◁
▷ V.17. Discrete time birth-and-death processes. Assume that, at discrete times $n = 0, 1, 2, \ldots$,
a population of size $j$ can grow by one element [a birth] with probability $\alpha_j$, decrease by one
element [a death] with probability $\beta_j$, and stay the same with probability $\gamma_j = 1 - \alpha_j - \beta_j$.
Let $\omega_n$ be the probability that an initially empty population is again empty at time $n$. Then the
GF of the sequence $(\omega_n)$ is

\[ \sum_{n \ge 0} \omega_n z^n = \cfrac{1}{1 - \gamma_0 z - \cfrac{\alpha_0 \beta_1 z^2}{1 - \gamma_1 z - \cfrac{\alpha_1 \beta_2 z^2}{\ddots}}}. \]

This result was found by I. J. Good in 1958: see [302]. ◁
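The statement of Note V.17 can be tested numerically by expanding a truncated continued fraction as a power series and comparing it with return probabilities obtained by iterating the chain. The sketch below assumes constant illustrative rates ($\alpha_j = 1/3$, and $\beta_j = 1/2$ for $j \ge 1$); these values and all function names are hypothetical.

```python
from fractions import Fraction

def series_inverse(f, n):
    """Coefficients of 1/f(z) up to z^(n-1); requires f[0] != 0."""
    g = [Fraction(0)] * n
    g[0] = 1 / Fraction(f[0])
    for k in range(1, n):
        g[k] = -g[0] * sum(f[i] * g[k - i] for i in range(1, k + 1))
    return g

def continued_fraction_series(alpha, beta, gamma, n):
    """First n coefficients of the fraction; n stages suffice for paths of length < n."""
    y = [Fraction(0)] * n                        # tail of the fraction, dropped
    for j in reversed(range(n)):
        den = [Fraction(0)] * n
        den[0] = Fraction(1)
        if n > 1:
            den[1] -= gamma[j]                   # - gamma_j z
        for k in range(n - 2):
            den[k + 2] -= alpha[j] * beta[j + 1] * y[k]   # - alpha_j beta_{j+1} z^2 y(z)
        y = series_inverse(den, n)
    return y

def return_probabilities(alpha, beta, gamma, n):
    """omega_0 .. omega_{n-1} by direct iteration of the birth-and-death chain."""
    dist = [Fraction(0)] * (n + 1)
    dist[0] = Fraction(1)
    omegas = []
    for _ in range(n):
        omegas.append(dist[0])
        new = [Fraction(0)] * (n + 1)
        for j, p in enumerate(dist):
            if p:
                new[j] += gamma[j] * p
                if j + 1 <= n:
                    new[j + 1] += alpha[j] * p
                if j >= 1:
                    new[j - 1] += beta[j] * p
        dist = new
    return omegas

N = 8
alpha = [Fraction(1, 3)] * (N + 1)
beta = [Fraction(0)] + [Fraction(1, 2)] * N      # no death from the empty population
gamma = [1 - x - y for x, y in zip(alpha, beta)]
print(continued_fraction_series(alpha, beta, gamma, N))
```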


▷ V.18. Continuous time birth-and-death processes. Consider a continuous time birth-and-death
process, where a transition from state $j$ to $j + 1$ takes place according to an exponential
distribution of rate $\lambda_j$ and a transition from $j$ to $j - 1$ has rate $\mu_j$. Let $\omega(t)$ be the probability
to be in state 0 at time $t$ starting from state 0 at time 0. One has

\[ \int_0^\infty e^{-st} \omega(t)\, dt = \cfrac{1}{s + \lambda_0 - \cfrac{\lambda_0 \mu_1}{s + \lambda_1 + \mu_1 - \cfrac{\lambda_1 \mu_2}{\ddots}}} = \cfrac{1}{s + \cfrac{\lambda_0}{1 + \cfrac{\mu_1}{s + \cfrac{\lambda_1}{1 + \cdots}}}}. \]

Thus, continued fractions and orthogonal polynomials may be used to analyse birth-and-death
processes. (This fact was originally discovered by Karlin and McGregor [362], with later
additions due to Jones and Magnus [358]. See [238] for a systematic discussion in relation to
combinatorial theory.) ◁


V.4.2. Analytic aspects. We now consider the general asymptotic properties of
lattice paths of height bounded from above by a fixed integer $h \ge 1$. Letters denoting
elementary steps are weighted, as previously indicated, with

\[ a_j = \alpha_j z, \qquad b_j = \beta_j z, \qquad c_j = \gamma_j z, \]

the weights being invariably non-negative. We shall limit the discussion to excursions,
which are often the most interesting objects from the combinatorial point of view.

As a preamble, in the Dyck case, where all $\gamma_j$ are 0 (level steps are disallowed),
the GF $H_{0,0}^{[<h]}$ is a function of $z^2$ only, since it takes an even number of steps to return
to altitude 0 when starting from altitude 0. In such a case, we shall systematically
assume that, when considering $[z^n] H_{0,0}^{[<h]}$, the index $n = 2\nu$ is even. In order to
avoid trivialities, we also assume that none of the coefficients attached to ascents and
descents are 0.


Theorem V.5 (Asymptotics of nested sequences). Consider the class $\mathcal{H}_{0,0}^{[<h]}$ of weighted
Motzkin excursions of height $< h$. In the non-Dyck case (at least one $\gamma_j \ne 0$), their
number satisfies a pure exponential–polynomial formula,

\[ H_{0,0,n}^{[<h]} = c B^n + O(C^n), \]

where $B > 0$ and $0 \le C < B$. In the Dyck case, the formula holds, assuming
furthermore that $n \equiv 0 \pmod 2$.




Proof. The proof proceeds by induction according to the depth of nesting of the
sequence constructions, starting with the innermost construction. (The present
discussion is similar to the analysis of the supercritical sequence schema in Section V.2,
p. 293.) Write

\[ f_j(z) := H_{h-j-1,\,h-j-1}^{[\ge h-j-1,\,<h]}(z), \]

and let $\rho_j$ denote the dominant singularity of $f_j$ that is positive (existence is
guaranteed by Pringsheim’s Theorem).

For ease of discussion, we first examine the case where all $\gamma_j$ are non-zero. The
function $f_0(z)$ is

\[ f_0(z) = \frac{1}{1 - \gamma_{h-1} z}, \]

and one has $\rho_0 = 1/\gamma_{h-1}$. The function $f_1$ is given by

\[ f_1(z) = \frac{1}{1 - \gamma_{h-2} z - \alpha_{h-2} \beta_{h-1} z^2 f_0(z)}. \]

The quantity $\gamma_{h-2} z + \alpha_{h-2} \beta_{h-1} z^2 f_0(z)$ in its denominator increases continuously
from 0 to $+\infty$ as $z$ increases from 0 to $\rho_0$; consequently, it crosses the value 1 at some
point which must be $\rho_1$. In particular, one must have $\rho_1 < \rho_0$. Our assumption that
all the $\gamma_j$ are non-zero implies the absence of periodicities, so that $\rho_1$ is the unique
dominant singularity. The argument can be repeated, implying that the sequence of
radii is decreasing, $\rho_0 > \rho_1 > \rho_2 > \cdots$, the corresponding poles are all simple, and
they are uniquely dominating. The statement is thus established in the case that all the
$\gamma_j$ are non-zero.

Dually, in the Dyck case where all the $\gamma_j$ are zero, one can reason in a similar
manner, operating with the collection of “condensed” series $f_j(\sqrt{z})$, which are seen
to have a unique dominant singularity. This implies that $f_j(z)$ itself has exactly two
dominant singularities, namely $\rho_j$ and $-\rho_j$, both being simple poles.

In the mixed case, the $f_j$ are initially of the Dyck type, until a certain $\gamma_{h-1-j_0} \ne 0$
is encountered. In that case the function $f_{j_0}$ is aperiodic (its span in the sense of
Definition IV.5, p. 266, is equal to 1). The reasoning then continues in a similar manner
to the Motzkin case, with all the subsequent $f_j$ (for $j > j_0$) including $f_{h-1}(z) \equiv H_{0,0}^{[<h]}(z)$
having a unique dominant singularity. ∎
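The exponential growth asserted by Theorem V.5 can be observed directly by counting ceiled excursions with a transfer recurrence on altitudes. The sketch below uses unit weights as an assumption; in that particular case the growth rate is $B = 1 + 2\cos(\pi/(h+1))$, the largest eigenvalue of the tridiagonal transfer matrix. Function and variable names are illustrative.

```python
from math import cos, pi

def ceiled_excursion_counts(alpha, beta, gamma, h, n_max):
    """Weighted number of excursions of length 0..n_max with altitudes in [0, h)."""
    vec = [0] * h                # vec[j]: weight of paths from 0 to altitude j
    vec[0] = 1
    counts = []
    for _ in range(n_max + 1):
        counts.append(vec[0])
        new = [0] * h
        for j, w in enumerate(vec):
            if w:
                new[j] += gamma[j] * w           # level step c_j
                if j + 1 < h:
                    new[j + 1] += alpha[j] * w   # ascent a_j
                if j >= 1:
                    new[j - 1] += beta[j] * w    # descent b_j
        vec = new
    return counts

h = 3
ones, zeros = [1] * h, [0] * h
motzkin = ceiled_excursion_counts(ones, ones, ones, h, 40)   # non-Dyck case
dyck = ceiled_excursion_counts(ones, ones, zeros, h, 40)     # Dyck case
growth = motzkin[31] / motzkin[30]
print(growth, 1 + 2 * cos(pi / (h + 1)))
```

In the Dyck case the odd-indexed counts vanish, which is the periodicity that forces the restriction to even $n$ in the theorem.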
Similar devices yield a characterization of the profile of a random path, that is, 
the number of times a given step appears in a random excursion. 


Theorem V.6 (Profile of nested sequences). Let $X_n$ be the random variable representing
the number of times a given step (of type $a_j$, $b_j$, or $c_j$) with non-zero weight
appears in a random excursion of length $n$ and height $< h$. The moments of $X_n$ satisfy

\[ \mathbb{E}(X_n) = c_1 n + d_1 + O(D^n), \qquad \mathbb{V}(X_n) = c_2 n + d_2 + O(D^n), \]

for constants $c_1, c_2, d_1, d_2, D$, with $c_1, c_2 > 0$ and $0 \le D < 1$. In particular the
distribution of $X_n$ is concentrated.


Proof. Introduce an auxiliary variable $u$ marking the number of designated steps, and
form the corresponding BGF $H(z, u)$. We only detail the case of expectations. The
function $H$ is a linear fractional transformation in $u$ of the form

\[ H(z, u) = A(z) + \frac{1}{C(z) + u D(z)}. \]

(The coefficients $A$, $C$, $D$ are a priori in $\mathbb{C}(z)$; they are in fact computable from
Proposition V.3.) Then, one has

\[ \left. \frac{\partial}{\partial u} H(z, u) \right|_{u=1} = \frac{-D(z)}{(C(z) + D(z))^2}. \]

This function resembles $H(z, 1)^2$. An application of the chain rule permits us to verify
that indeed

\[ \left. \frac{\partial}{\partial u} H(z, u) \right|_{u=1} = E(z) H(z, 1)^2, \]

where $E(z)$ is analytic in a disc larger than the disc of analyticity of $H(z, 1)$. The
analysis of the dominant double pole then yields the result. (The determination of the
second moment follows along similar lines: a triple pole is involved.) ∎


▷ V.19. All poles are real. Assume again $\alpha_j \beta_{j+1} > 0$ and $\gamma_j \ge 0$. By Note V.16, the
denominator polynomials $Q_h$ are reciprocals of a family of polynomials $\overline{Q}_h$ that are formally
orthogonal with respect to a scalar product. Thus the zeros of any of the $\overline{Q}_h$ are all real, and so
are the zeros of $Q_h$. Consequently: The poles of the OGF of ceiled excursions $H_{0,0}^{[<h]}$ are all
real. (See for instance [563, §3.3] for the basic argument.) ◁


V.4.3. Applications. Lattice paths have quite a wide range of descriptive power,
especially when weights are allowed. We illustrate this fact by three types of examples.

Example V.8 provides a complete analysis of height in Dyck paths and general
plane rooted trees, as regards moments as well as distribution. This is the simplest
case of a continued fraction (one with constant coefficients) attached to the OGF of
Catalan numbers and involving Fibonacci–Chebyshev polynomials. Example V.9 discusses
coin fountains. There, we are dealing with an infinite continued fraction to
which the techniques of the previous subsection can be extended. (The developments
take us close to the realm of $q$-calculus and to the analysis of alcohols seen in Chapter IV.)
Example V.10 constitutes a typical application of the possibility of encoding
combinatorial structures (here, interconnection networks) by means of lattice paths
weighted by integers. The enumeration involves Hermite polynomials. (Other examples
related to set partitions and permutations are described in the accompanying
notes.)


Example V.8. Height of Dyck paths and plane rooted trees. In order to count lattice paths of
the Dyck ($\mathcal{D}$) or Motzkin ($\mathcal{M}$) type, it suffices to effect one of the substitutions,

\[ \sigma_{\mathcal{M}} : a_j \mapsto z,\ b_j \mapsto z,\ c_j \mapsto z; \qquad \sigma_{\mathcal{D}} : a_j \mapsto z,\ b_j \mapsto z,\ c_j \mapsto 0. \]

We henceforth restrict attention to the case of Dyck paths. See Figure V.9 for three simulations
suggesting that the distribution of height is somewhat spread. Given the parenthesis system
representation (Note I.48, p. 77), the height of a Dyck path automatically translates into the height
of the corresponding plane rooted tree.


Figure V.9. Three random Dyck paths of length 2n = 500 have heights, respectively, 
20, 31, 24: the distribution is spread, see Proposition V.4. 


Expressions of GFs. The continued fraction expressing $H_{0,0}$ results immediately from
Proposition V.3 and is in this case periodic (here, in the sense that its stages are all alike); it
represents a quadratic function,

\[ H_{0,0}(z) = \cfrac{1}{1 - \cfrac{z^2}{1 - \cfrac{z^2}{1 - \ddots}}} = \frac{1}{2z^2} \left( 1 - \sqrt{1 - 4z^2} \right), \]

since $H_{0,0}$ satisfies $y = (1 - z^2 y)^{-1}$. The families of polynomials $P_h$, $Q_h$ are in this case
determined by a recurrence with constant coefficients. Define classically the Fibonacci polynomials
by the recurrence

\[ F_{h+2}(z) = F_{h+1}(z) - z F_h(z), \qquad F_0(z) = 0, \quad F_1(z) = 1. \tag{62} \]

One finds $Q_h = F_{h+1}(z^2)$ and $P_h = F_h(z^2)$. (The Fibonacci polynomials are reciprocals of
Chebyshev polynomials; see Note V.20, p. 329.) By Proposition V.3, the GF of paths of height
$< h$ is then

\[ H_{0,0}^{[<h]}(z) = \frac{F_h(z^2)}{F_{h+1}(z^2)}. \]

(We get more and, for instance, the number of ways of crossing a strip of width $h - 1$ is
$H_{0,h-1}^{[<h]}(z) = z^{h-1} / F_{h+1}(z^2)$.) The Fibonacci polynomials have an explicit form,

\[ F_h(z) = \sum_{k=0}^{\lfloor (h-1)/2 \rfloor} \binom{h - 1 - k}{k} (-z)^k, \]

as follows from the generating function expression: $\sum_h F_h(z) y^h = y / (1 - y + z y^2)$.
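As a quick check of the recurrence (62), of the explicit binomial form, and of the quotient $F_h/F_{h+1}$ as a counting series, one may compute with coefficient lists. The helper names below are illustrative assumptions; polynomials are stored lowest degree first, and the series variable stands for $z^2$ (that is, it marks half-length).

```python
from math import comb

def fibonacci_polynomials(h_max):
    """F_0 .. F_{h_max} as coefficient lists, from F_{h+2} = F_{h+1} - z F_h."""
    F = [[0], [1]]
    for h in range(2, h_max + 1):
        a, b = F[h - 1], F[h - 2]
        poly = [0] * max(len(a), len(b) + 1)
        for i, x in enumerate(a):
            poly[i] += x
        for i, x in enumerate(b):
            poly[i + 1] -= x                 # subtract z * F_{h-2}
        while len(poly) > 1 and poly[-1] == 0:
            poly.pop()
        F.append(poly)
    return F

def explicit_form(h):
    """Coefficients of sum_k binom(h-1-k, k) (-z)^k, for h >= 1."""
    return [(-1) ** k * comb(h - 1 - k, k) for k in range((h - 1) // 2 + 1)]

def series_quotient(num, den, n):
    """First n coefficients of num/den, assuming den[0] == 1."""
    out = []
    for k in range(n):
        s = num[k] if k < len(num) else 0
        for i in range(1, min(k, len(den) - 1) + 1):
            s -= den[i] * out[k - i]
        out.append(s)
    return out

def bounded_dyck_counts(h, n):
    """Numbers of Dyck paths of half-length 0..n-1 and height < h."""
    F = fibonacci_polynomials(h + 1)
    return series_quotient(F[h], F[h + 1], n)

print(bounded_dyck_counts(4, 8))
```

For $h = 3$ the quotient is $(1 - z)/(1 - 2z)$, giving powers of two; for $h = 4$ it produces every other Fibonacci number.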

The equivalence between Dyck paths and (general) plane tree traversals discussed in Chapter I
(p. 73) implies that trees of height at most $h$ and size $n + 1$ are equinumerous with Dyck
paths of length $2n$ and height at most $h$. Set for convenience

\[ G^{[h]}(z) = z H_{0,0}^{[<h+1]}(z^{1/2}) = z \, \frac{F_{h+1}(z)}{F_{h+2}(z)}, \]

which is precisely the OGF of general plane trees having height $\le h$. (This is otherwise in
agreement with the continued fraction forms obtained directly in Chapter III: cf. (53), p. 195
and (79), p. 216.) It is possible to go much further as first shown by De Bruijn, Knuth, and Rice
in a landmark paper [145], which also constitutes a historic application of Mellin transforms in
analytic combinatorics. (We refer to this paper for historical context and references.)
First, solving the linear recurrence (62) with $z$ treated as a parameter yields the alternative
closed form expression

\[ F_h(z) = \frac{G^h - \overline{G}^h}{G - \overline{G}}, \qquad G = \frac{1 - \sqrt{1 - 4z}}{2}, \quad \overline{G} = \frac{1 + \sqrt{1 - 4z}}{2}. \tag{63} \]

There, $G(z)$ is the OGF of all trees, and an equivalent form of $G^{[h]}$ is provided by

\[ G^{[h-2]}(z) = G \, \frac{1 - u^{h-1}}{1 - u^h}, \qquad \text{where} \quad u = \frac{1 - \sqrt{1 - 4z}}{1 + \sqrt{1 - 4z}} = \frac{G^2}{z}, \tag{64} \]

as is easily verified. Thus $G^{[h-2]}$ can be expressed in terms of $G(z)$ and $z$:

\[ G - G^{[h-2]} = \sqrt{1 - 4z} \, \frac{u^h}{1 - u^h} = \sqrt{1 - 4z} \sum_{j \ge 1} \frac{G(z)^{2jh}}{z^{jh}}. \]

The Lagrange–Bürmann inversion theorem (p. 732) then gives after a simple calculation

\[ G_{n+1} - G_{n+1}^{[h-2]} = \sum_{j \ge 1} \Delta^2 \binom{2n}{n - jh}, \tag{65} \]

where

\[ \Delta^2 \binom{2n}{n - m} := \binom{2n}{n + 1 - m} - 2 \binom{2n}{n - m} + \binom{2n}{n - 1 - m}. \]

Consequently, the number of trees of height $\ge h - 1$ admits a closed form: it is a “sampled”
sum, by steps of $h$, of the $2n$th line of Pascal’s triangle (upon taking second-order differences).
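The sampled sum (65) lends itself to a direct numerical check: the number of trees with $n + 1$ nodes and height at least $k$ (take $h = k + 1$ in (65)) should equal the Catalan number minus the number of Dyck paths of length $2n$ staying at altitude at most $k - 1$. The sketch below makes that comparison; the function names are assumptions introduced for illustration.

```python
from math import comb

def binom(n, k):
    """Binomial coefficient, taken to be 0 outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0

def second_difference(n, m):
    """Delta^2 binom(2n, n - m) as defined after (65)."""
    return binom(2 * n, n + 1 - m) - 2 * binom(2 * n, n - m) + binom(2 * n, n - 1 - m)

def trees_height_at_least(n, k):
    """Trees with n + 1 nodes and height >= k (k >= 1), via (65) with h = k + 1."""
    h = k + 1
    return sum(second_difference(n, j * h) for j in range(1, (n + 1) // h + 1))

def dyck_paths_height_at_most(n, k):
    """Direct count of Dyck paths of length 2n whose altitude never exceeds k."""
    if k < 0:
        return 0
    vec = [0] * (k + 1)
    vec[0] = 1
    for _ in range(2 * n):
        new = [0] * (k + 1)
        for j, w in enumerate(vec):
            if j + 1 <= k:
                new[j + 1] += w
            if j >= 1:
                new[j - 1] += w
        vec = new
    return vec[0]

print([trees_height_at_least(6, k) for k in range(1, 8)])
```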


Probability distribution of height. The relation (65) leads easily to the asymptotic distribution
of height in random trees of size $n$. Stirling’s formula yields the Gaussian approximation
of binomial numbers: for $k = o(n^{3/4})$ and with $w = k/\sqrt{n}$, one finds

\[ \frac{\binom{2n}{n-k}}{\binom{2n}{n}} \sim e^{-w^2} \left( 1 - \frac{w^4 - 3w^2}{6n} + \frac{5w^8 - 54w^6 + 135w^4 - 60w^2}{360 n^2} + \cdots \right). \tag{66} \]

The use of the Gaussian approximation (66) inside the exact formula (65) then implies: The
probability that a tree of size $n + 1$ has height at least $h - 1$ satisfies uniformly for
$h \in [\alpha\sqrt{n}, \beta\sqrt{n}]$ (for any $\alpha, \beta$ such that $0 < \alpha < \beta < \infty$) the estimate

\[ \frac{G_{n+1} - G_{n+1}^{[h-2]}}{G_{n+1}} = \Theta\!\left( \frac{h}{\sqrt{n}} \right) + O\!\left( \frac{1}{n} \right), \qquad \Theta(x) := \sum_{j \ge 1} e^{-j^2 x^2} (4 j^2 x^2 - 2). \tag{67} \]

The function $\Theta(x)$ is a “theta function” which classically arises in the theory of elliptic
functions [604]. Since binomial coefficients decay rapidly, away from the centre, simple bounds also
show that the probability of the height being at least $n^{1/2+\epsilon}$ decays as $\exp(-n^{2\epsilon})$, so that it is
exponentially small. Note also that the probability distribution of height $H$ itself admits an exact
expression obtained by differencing (65), which is reflected asymptotically by differentiation of
the estimate of (67):

\[ \mathbb{P}_{\mathcal{G}_{n+1}}\bigl[ H = \lfloor x\sqrt{n} \rfloor \bigr] = -\frac{1}{\sqrt{n}} \Theta'(x) + O\!\left( \frac{1}{n} \right), \qquad \Theta'(x) := \sum_{j \ge 1} e^{-j^2 x^2} (12 j^2 x - 8 j^4 x^3). \tag{68} \]

The forms (67) and (68) also give access to moments of the distribution of height. We find

\[ \mathbb{E}_{\mathcal{G}_{n+1}}[H^r] \sim \frac{1}{\sqrt{n}} \, S_r\!\left( \frac{1}{\sqrt{n}} \right), \qquad \text{where} \quad S_r(y) := -\sum_{h \ge 1} h^r \Theta'(hy). \]
Figure V.10. The limit density of the distribution of height, $-\Theta'(x)$.

The quantity $y^{r+1} S_r(y)$ is a Riemann sum relative to the function $-x^r \Theta'(x)$, and the step
$y = n^{-1/2}$ decreases to 0 as $n \to \infty$. Approximating the sum by the integral, one gets:

\[ \mathbb{E}_{\mathcal{G}_{n+1}}[H^r] \sim n^{r/2} \mu_r, \qquad \text{where} \quad \mu_r := -\int_0^\infty x^r \Theta'(x)\, dx. \]

The integral giving $\mu_r$ is a Mellin transform in disguise (set $s = r + 1$) to which the treatment
of harmonic sums applies. We then get upon replacing $n + 1$ by $n$:

Proposition V.4. The expected height of a random plane rooted tree comprising $n + 1$ nodes is

\[ \sqrt{\pi n} - \frac{3}{2} + o(1). \tag{69} \]

More generally, the moment of order $r$ of height is asymptotic to

\[ \mu_r n^{r/2}, \qquad \text{where} \quad \mu_r = r(r - 1)\Gamma(r/2)\zeta(r). \tag{70} \]

The random variable $H/\sqrt{n}$ obeys asymptotically a Theta distribution, in the sense of both the
“central” estimate (67) and the “local” estimate (68). The same asymptotic estimates hold for
height of Dyck paths having length $2n$.

The improved estimate of the mean (69) is from [145]. The general form of moments
in (70) is in fact valid for any real $r$ (not just integers). An alternative formula for the Theta
function appears in Note V.20 below. Figure V.10 plots the limit density $-\Theta'(x)$, which surfaces
again in the height of binary and other simple trees (Example VII.27, p. 535). ∎
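The mean height can be computed exactly for moderate sizes by summing the tail probabilities supplied by the sampled sum (65), since $\mathbb{E}[H] = \sum_{k \ge 1} \mathbb{P}(H \ge k)$, and then set against the leading terms $\sqrt{\pi n} - 3/2$. The following is a minimal sketch; the function name and the choice $n = 200$ are illustrative assumptions, and the comparison is only indicative because of the $o(1)$ term.

```python
from fractions import Fraction
from math import comb, pi, sqrt

def mean_height(n):
    """Exact mean height of plane trees with n + 1 nodes, from the tails in (65)."""
    def b(k):
        return comb(2 * n, k) if 0 <= k <= 2 * n else 0
    total = 0
    for h in range(2, n + 2):                  # tail P(H >= h - 1)
        for j in range(1, (n + 1) // h + 1):
            m = j * h
            total += b(n + 1 - m) - 2 * b(n - m) + b(n - 1 - m)
    catalan = comb(2 * n, n) // (n + 1)
    return Fraction(total, catalan)

n = 200
exact = float(mean_height(n))
estimate = sqrt(pi * n) - 1.5
print(exact, estimate)
```

For very small sizes the exact values are easy to confirm by hand: the two trees with three nodes have heights 1 and 2, giving mean 3/2.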


▷ V.20. Height and Fibonacci–Chebyshev polynomials. The reciprocal polynomials
$\overline{F}_h(z) = z^{h-1} F_h(1/z^2)$ are related to the classical Chebyshev polynomials by $\overline{F}_h(2z) = U_{h-1}(z)$,
where $U_h(\cos\theta) = \sin((h + 1)\theta)/\sin\theta$. (This is readily verified from the
recurrence (62) and elementary trigonometry.) Then, the roots of $F_h(z)$ are $(4\cos^2(j\pi/h))^{-1}$
and the partial fraction expansion of $G^{[h]}(z)$ can be worked out explicitly [145]. Thus, for
$n \ge 1$,

\[ G_{n+1}^{[h-2]} = \frac{4^{n+1}}{h} \sum_{1 \le j < h/2} \sin^2\frac{j\pi}{h} \, \cos^{2n}\frac{j\pi}{h}, \tag{71} \]

which provides in particular an asymptotic form for any fixed $h$. (This formula can also be
found directly from the sampled sum (65) by multisection of series.) Asymptotic analysis of
this last expression when $h = x\sqrt{n}$ yields the alternative expression

\[ \lim_{n \to \infty} \mathbb{P}_{\mathcal{G}_{n+1}}\bigl[ H \le x\sqrt{n} \bigr] = 4\pi^{5/2} x^{-3} \sum_{j \ge 0} j^2 e^{-j^2\pi^2/x^2} \qquad (\equiv 1 - \Theta(x)), \]

which, when compared with (67), reflects an important transformation formula of elliptic
functions [604]. See the study by Biane, Pitman, and Yor [64] for fascinating connections with
Brownian motion and the functional equation of the Riemann zeta function. Height in simple
varieties of trees also obeys a Theta law, but the proofs (Example VII.27, p. 535) require the
full power of singularity analysis. ◁
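The trigonometric formula (71) can be compared with a direct count of height-bounded Dyck paths, using the tree and path correspondence (trees with $n + 1$ nodes and height at most $h - 2$ correspond to Dyck paths of length $2n$ with altitude at most $h - 2$). The sketch below works in floating point, so agreement is checked up to rounding; the function names are illustrative assumptions.

```python
from math import sin, cos, pi

def trees_by_trigonometric_sum(n, h):
    """Right-hand side of (71): trees with n + 1 nodes and height <= h - 2 (n >= 1)."""
    s = sum(sin(j * pi / h) ** 2 * cos(j * pi / h) ** (2 * n)
            for j in range(1, h) if 2 * j < h)
    return 4 ** (n + 1) / h * s

def dyck_paths_height_at_most(n, k):
    """Dyck paths of length 2n whose altitude never exceeds k."""
    vec = [0] * (k + 1)
    vec[0] = 1
    for _ in range(2 * n):
        new = [0] * (k + 1)
        for j, w in enumerate(vec):
            if j + 1 <= k:
                new[j + 1] += w
            if j >= 1:
                new[j - 1] += w
        vec = new
    return vec[0]

for h in range(2, 7):
    print(h, [round(trees_by_trigonometric_sum(n, h)) for n in range(1, 8)])
```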


▷ V.21. Motzkin paths. The OGF of Motzkin paths of height $< h$ is
$\frac{1}{1-z} H_{0,0}^{[<h]}\left( \frac{z}{1-z} \right)$, where $H_{0,0}^{[<h]}$ refers to Dyck paths.
Therefore, such paths can be enumerated exactly by formulae
derived from Equations (65) to (71). Accordingly, the mean height is $\sim \sqrt{\pi n / 3}$. ◁


Example V.9. Area under Dyck path and coin fountains. Consider Dyck paths and the area
parameter: area under a lattice path is taken here as the sum of the indices (i.e., the starting
altitudes) of all the variables that enter the standard encoding of the path. Thus, the BGF $D(z, q)$
of Dyck paths with $z$ marking half-length and $q$ marking area is obtained by the substitution

\[ a_j \mapsto q^j z, \qquad b_j \mapsto q^j, \qquad c_j \mapsto 0 \]

inside the fundamental continued fraction (55). (We rederive here Equation (54) of Chapter III,
p. 196.) It proves convenient to operate with the continued fraction

\[ F(z, q) = \cfrac{1}{1 - \cfrac{zq}{1 - \cfrac{zq^2}{1 - \cfrac{zq^3}{\ddots}}}}, \tag{72} \]

so that $D(z, q) = F(q^{-1} z, q^2)$. Since $F$ satisfies a difference equation,

\[ F(z, q) = \frac{1}{1 - zq F(qz, q)}, \tag{73} \]

moments of area can be determined by differentiating and setting $q = 1$ (see Chapter III, p. 184,
for a direct approach).

A general trick from $q$-calculus is effective for deriving an alternative form of $F$. Express
the continued fraction $F$ of (72) as a quotient $F(z, q) = A(z)/B(z)$. Then, the relation (73)
implies

\[ \frac{A(z)}{B(z)} = \cfrac{1}{1 - qz \cfrac{A(qz)}{B(qz)}}, \]

and, by identifying numerators and denominators, we get

\[ A(z) = B(qz), \qquad B(z) = B(qz) - qz B(q^2 z), \]

with $q$ treated as a parameter. The difference equation satisfied by $B(z)$ is then readily solved
by indeterminate coefficients. (This classical technique was introduced in the theory of integer
partitions by Euler.) With $B(z) = \sum b_n z^n$, the coefficients satisfy the recurrence

\[ b_0 = 1, \qquad b_n = q^n b_n - q^{2n-1} b_{n-1}. \]

This is a first-order recurrence on $b_n$ that unwinds to give

\[ b_n = (-1)^n \frac{q^{n^2}}{(1 - q)(1 - q^2) \cdots (1 - q^n)}. \]
In other words, introducing the “$q$-exponential function”,

\[ E(z, q) = \sum_{n=0}^{\infty} \frac{(-z)^n q^{n^2}}{(q)_n}, \qquad \text{where} \quad (q)_n = (1 - q)(1 - q^2) \cdots (1 - q^n), \tag{74} \]

one finds

\[ F(z, q) = \frac{E(qz, q)}{E(z, q)}. \tag{75} \]

The exact distribution of area in Dyck paths can then be regarded as known, in the sense that
it is fully characterized by (74) and (75). (Example VII.26, p. 533, presents an analysis of the
corresponding limit distribution, based on “moment pumping”, to the effect that an Airy law
prevails.)

Given the importance of the functions under discussion in various branches of mathematics,
we cannot resist a quick digression. The name of the $q$-exponential comes from the obvious
property that $E(z(1 - q), q)$ reduces to $e^{-z}$ as $q \to 1^-$. The explicit form (74) constitutes in
fact the “easy half” of the proof of the celebrated Rogers–Ramanujan identities, namely,

\[ \begin{aligned}
E(-1, q) &= \sum_{n=0}^{\infty} \frac{q^{n^2}}{(q)_n} = \prod_{n=0}^{\infty} (1 - q^{5n+1})^{-1} (1 - q^{5n+4})^{-1}, \\
E(-q, q) &= \sum_{n=0}^{\infty} \frac{q^{n(n+1)}}{(q)_n} = \prod_{n=0}^{\infty} (1 - q^{5n+2})^{-1} (1 - q^{5n+3})^{-1},
\end{aligned} \tag{76} \]

that relate the $q$-exponential to modular forms. See Andrews’ book [14, Ch. 7] for context.

Coin fountains. Here is finally a cute application of these ideas to the asymptotic enumeration
of some special polyominoes. Odlyzko and Wilf define in [461, 464] an $(n, m)$ coin
fountain as an arrangement of $n$ coins in rows in such a way that there are $m$ coins in the bottom
row, and that each coin in a higher row touches exactly two coins in the next lower row. Let
$C_{n,m}$ be the number of $(n, m)$ fountains and $C(z, q)$ be the corresponding BGF with $q$ marking
$n$ and $z$ marking $m$. Set $C(q) = C(1, q)$. The question is to determine the total number of
coin fountains of area $n$, $[q^n] C(q)$. The series starts as (this is EIS A005169)

\[ C(q) = 1 + q + q^2 + 2q^3 + 3q^4 + 5q^5 + 9q^6 + 15q^7 + 26q^8 + \cdots, \]

as results from inspection of the first few cases.

There is a clear bijection with Dyck paths (do a 135° scan) that takes area into account: a
coin fountain of size $n$ with $m$ coins on its base is equivalent to a Dyck path of length $2m$ and
area $2n - m$ (with our earlier definition of area of Dyck paths). From this bijection, one has
$C(z, q) = F(z, q)$ (with $F$ as defined earlier) and, in particular, $C(q) = F(1, q)$. Consequently,
by (72) and (75), we find

\[ C(q) = \cfrac{1}{1 - \cfrac{q}{1 - \cfrac{q^2}{1 - \cfrac{q^3}{\ddots}}}} = \frac{E(q, q)}{E(1, q)}. \]
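The quotient form of $C(q)$ can be expanded mechanically. The sketch below builds $E(1, q)$ and $E(q, q)$ from the series (74) as truncated integer power series in $q$ and divides them; the helper names are illustrative assumptions. The output can be set against the initial terms quoted above.

```python
def mul(f, g, n):
    """Product of two power series, truncated to n coefficients."""
    out = [0] * n
    for i, x in enumerate(f):
        if x:
            for j, y in enumerate(g):
                if i + j >= n:
                    break
                out[i + j] += x * y
    return out

def inverse(f, n):
    """1/f truncated to n coefficients, for integer f with f[0] == 1."""
    g = [0] * n
    g[0] = 1
    for k in range(1, n):
        g[k] = -sum(f[i] * g[k - i] for i in range(1, k + 1))
    return g

def q_exponential(shift, n):
    """E(q^shift, q) mod q^n from (74): sum_k (-1)^k q^(k^2 + shift*k) / (q)_k."""
    total = [0] * n
    poch = [1] + [0] * (n - 1)           # (q)_k as a truncated polynomial
    k = 0
    while k * k + shift * k < n:
        if k > 0:
            factor = [0] * n
            factor[0], factor[k] = 1, -1
            poch = mul(poch, factor, n)  # multiply by (1 - q^k)
        term = inverse(poch, n)
        e = k * k + shift * k
        sign = -1 if k % 2 else 1
        for i in range(n - e):
            total[e + i] += sign * term[i]
        k += 1
    return total

def coin_fountains(n):
    """First n coefficients of C(q) = E(q, q) / E(1, q)."""
    return mul(q_exponential(1, n), inverse(q_exponential(0, n), n), n)

print(coin_fountains(12))
```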
Objects                   Weights ($\alpha_j$, $\beta_j$, $\gamma_j$)   Counting           Orthogonal pol.
Simple paths              1, 1, 0                      Catalan #          Chebyshev
Permutations              j+1, j, 2j+1                 Factorial #        Laguerre
Alternating perm.         j+1, j, 0                    Secant #           Meixner
Involutions               1, j, 0                      Odd factorial #    Hermite
Set partition             1, j, j+1                    Bell #             Poisson–Charlier
Non-overlap. set part.    1, 1, j+1                    Bessel #           Lommel

Figure V.11. Some special families of combinatorial objects together with corresponding
weights, counting sequences, and orthogonal polynomials. (See also
Notes V.23–25.)


The rest of the discussion is analogous to Section IV.7.3 (p. 283) relative to alcohols. The
function $C(q)$ is a priori meromorphic in $|q| < 1$. An exponential lower bound of the form
$1.6^n$ holds for $[q^n] C(q)$, since $(1 - q)/(1 - q - q^2)$ is dominated by $C(q)$ for $q > 0$. At the
same time, the number $[q^n] C(q)$ is majorized by the number of compositions, which is $2^{n-1}$.
Thus, the radius of convergence of $C(q)$ has to lie somewhere between 0.5 and 0.61803.... It
is then easy to check by numerical analysis the existence of a simple zero of the denominator,
$E(1, q)$, near $\rho \doteq 0.57614$. Routine computations based on Rouché’s theorem then make it
possible to verify formally that $\rho$ is the only pole in $|q| \le 3/5$ and that this pole is simple (the
process is detailed in [461]). Thus, singularity analysis of meromorphic functions applies.

Proposition V.5. The number of coin fountains made of $n$ coins satisfies asymptotically

\[ [q^n] C(q) = c A^n + O((5/3)^n), \qquad c \doteq 0.31236, \quad A = \rho^{-1} \doteq 1.73566. \]
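The numerical location of the dominant pole can be reproduced with a few lines of floating-point code: evaluate $E(1, q)$ from a truncated form of the series (74) and bisect on the interval $[0.5, 0.6]$, where the function changes sign. This is a rough sketch with illustrative names and an assumed truncation of 25 terms, not the verification by Rouché’s theorem referred to in the text.

```python
def E1(q, terms=25):
    """Truncated E(1, q) = sum_k (-1)^k q^(k^2) / ((1-q)(1-q^2)...(1-q^k))."""
    total, poch = 0.0, 1.0
    for k in range(terms):
        if k > 0:
            poch *= 1.0 - q ** k
        total += (-1) ** k * q ** (k * k) / poch
    return total

def bisect_root(f, lo, hi, iterations=60):
    """Plain bisection; assumes f changes sign on [lo, hi]."""
    assert f(lo) * f(hi) < 0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

rho = bisect_root(E1, 0.5, 0.6)
print(rho, 1 / rho)
```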


This example illustrates the power of modelling by continued fractions as well as the
smooth articulation with meromorphic function asymptotics. ∎


Lattice path encodings of classical structures. The systematic theory of lattice
path enumerations and continued fractions was developed initially because of the need
to count weighted lattice paths, notably in the context of the analysis of dynamic data
structures in computer science [226]. In this framework, a system of multiplicative
weights $\alpha_j, \beta_j, \gamma_j$ is associated with the steps $a_j, b_j, c_j$, each weight being an
integer that represents a number of “possibilities” for the corresponding step type. A
system of weighted lattice paths has counting generating functions given by the usual
substitution from the corresponding multivariate expressions; namely,

\[ a_j \mapsto \alpha_j z, \qquad b_j \mapsto \beta_j z, \qquad c_j \mapsto \gamma_j z, \tag{77} \]

where $z$ marks the length of paths. One can then attempt to solve an enumeration
problem expressible in this way by reverse-engineering the known collection of continued
fractions as found in reference books such as those by Perron [479], Wall [601],
and Lorentzen–Waadeland [412]. Next, for general reasons, the polynomials $P$, $Q$ are
always elementary variants of a family of orthogonal polynomials that is determined
by the weights (see Note V.16, p. 323, and [118, 563]). When the multiplicities have
enough structural regularity, the weighted lattice paths are likely to correspond to
classical combinatorial objects and to classical families of orthogonal polynomials;
see [214, 226, 295, 303] and Figure V.11 for an outline. We illustrate this by a simple
example due to Lagarias, Odlyzko, and Zagier [394], which is relative to involutions
without fixed points.

Figure V.12. An interconnection network on $2n = 12$ points.
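The weight systems listed in Figure V.11 can be tried out with a short dynamic program that counts weighted Motzkin excursions under the substitution (77). The sketch below is an illustration only: the function name is an assumption, and the weights are passed as functions of the altitude $j$.

```python
def weighted_excursions(alpha, beta, gamma, n_max):
    """[z^n] H_{0,0} for n = 0..n_max, with step weights alpha(j), beta(j), gamma(j)."""
    size = n_max + 1                 # altitudes above n_max cannot be reached and left
    vec = [0] * size
    vec[0] = 1
    counts = []
    for _ in range(n_max + 1):
        counts.append(vec[0])
        new = [0] * size
        for j, w in enumerate(vec):
            if w:
                new[j] += gamma(j) * w               # level step at altitude j
                if j + 1 < size:
                    new[j + 1] += alpha(j) * w       # ascent from altitude j
                if j >= 1:
                    new[j - 1] += beta(j) * w        # descent from altitude j
        vec = new
    return counts

bell = weighted_excursions(lambda j: 1, lambda j: j, lambda j: j + 1, 6)
factorial = weighted_excursions(lambda j: j + 1, lambda j: j, lambda j: 2 * j + 1, 6)
involutions = weighted_excursions(lambda j: 1, lambda j: j, lambda j: 0, 6)
print(bell, factorial, involutions)
```

Each row of the figure then corresponds to one call, and the first few output values can be compared with the named counting sequence.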


Example V.10. Interconnection networks and involutions. The problem treated here is the
following [394]. There are $2n$ points on a line, with $n$ point-to-point connections between pairs
of points. What is the probable behaviour of the width of such an interconnection network?
Imagine the points to be $1, \ldots, 2n$, the connections as circular arcs between points, and let a
vertical line sweep from left to right; width is defined as the maximum number of arcs met by
such a line. One may freely imagine a tunnel of fixed capacity (this corresponds to the width)
inside which wires can be placed to connect points pairwise (Figure V.12).

Let $\mathcal{I}_{2n}$ be the class of all interconnection networks on $2n$ points, which is precisely the
collection of ways of grouping $2n$ elements into $n$ pairs, or, equivalently, the class of all
involutions without fixed points, i.e., permutations with cycles of length 2 only. The number $I_{2n}$
equals the “odd factorial”,

\[ I_{2n} = 1 \cdot 3 \cdot 5 \cdots (2n - 1), \]

whose EGF is $e^{z^2/2}$ (see Chapter II, p. 122). The problem calls for determining the quantity
$I_{2n}^{[h]}$ that is the number of networks having width $\le h$.

The relation to lattice paths is as follows. First, when sweeping a vertical line across a
network, define an active arc at an abscissa as one that straddles that abscissa. Then build
the sequence of active arc counts at half-integer positions $\frac{1}{2}, \frac{3}{2}, \ldots, 2n - \frac{1}{2}, 2n + \frac{1}{2}$. This
constitutes a sequence of integers in which each member is $\pm 1$ the previous one; that is, a
lattice path without level steps. In other words, there is an ascent in the lattice path for each
element that is smaller in its cycle and a descent otherwise. One may view ascents as associated
to situations where a node “opens” a new cycle, while descents correspond to “closing” a cycle.

Involutions are much more numerous than lattice paths, so that the correspondence from
involutions to lattice paths has to be many-to-one. However, one can easily enrich lattice paths,
so that the enriched objects are in one-to-one correspondence with involutions. Consider again
a scanning position at a half-integer where the vertical line crosses $\ell$ (active) arcs. If the next
node is of the closing type, there are $\ell$ possibilities to choose from. If the next node is of
the opening type, then there is only one possibility, namely, to start a new cycle. A complete
encoding of a network is accordingly obtained by recording additionally the sequence of the $n$
possible choices corresponding to descents in the lattice path (some canonical order is fixed, for
instance, oldest first). If we write these choices as superscripts, this means that the set of all
enriched encodings of networks is obtained from the set of standard lattice path encodings by
effecting the substitutions

\[ b_j \mapsto \sum_{k=1}^{j} b_j^{(k)}. \]
Figure V.13. Three simulations of random networks with $2n = 1000$ illustrate the
tendency of the profile to conform to a parabola with height close to $n/2 = 250$.


The OGF of all involutions is obtained from the generic continued fraction of
Proposition V.3 by the substitution

\[ a_j \mapsto z, \qquad b_j \mapsto j \cdot z, \]

where $z$ records the number of steps in the enriched lattice path, or equivalently, the number
of nodes in the network. In other words, we have obtained combinatorially a formal continued
fraction representation,

\[ \sum_{n=0}^{\infty} \bigl( 1 \cdot 3 \cdots (2n - 1) \bigr) z^{2n} = \cfrac{1}{1 - \cfrac{1 \cdot z^2}{1 - \cfrac{2 \cdot z^2}{1 - \cfrac{3 \cdot z^2}{\ddots}}}}, \]

which was originally discovered by Gauss [601]. Proposition V.3 also gives immediately the
OGF of involutions of width at most $h$ as a quotient of polynomials. Define

\[ I^{[h]}(z) := \sum_{n \ge 0} I_{2n}^{[h]} z^{2n}. \]

One has

\[ I^{[h]}(z) = \cfrac{1}{1 - \cfrac{1 \cdot z^2}{1 - \cfrac{2 \cdot z^2}{\ddots - \cfrac{(h-1) \cdot z^2}{1 - h \cdot z^2}}}} = \frac{P_{h+1}(z)}{Q_{h+1}(z)}, \]

where $P_h$ and $Q_h$ satisfy the recurrence

\[ Y_{h+1} = Y_h - h z^2 Y_{h-1}. \]

The polynomials are readily determined by their generating functions, which satisfy a first-order
linear differential equation reflecting the recurrence. In this way, the denominator polynomials
are identified to be reciprocals of the Hermite polynomials,

\[ \mathrm{He}_h(z) = (2z)^h \, Q_h\!\left( \frac{1}{z\sqrt{2}} \right), \]

themselves defined classically [3, Ch. 22] as orthogonal with respect to the measure $e^{-x^2}\,dx$
on $(-\infty, \infty)$ and expressible via

\[ \mathrm{He}_m(x) = \sum_{j=0}^{\lfloor m/2 \rfloor} \frac{(-1)^j \, m!}{j! \, (m - 2j)!} (2x)^{m-2j}, \qquad \sum_{m \ge 0} \mathrm{He}_m(x) \frac{t^m}{m!} = e^{2xt - t^2}. \]

In particular, one finds

\[ I^{[0]} = 1, \qquad I^{[1]} = \frac{1}{1 - z^2}, \qquad I^{[2]} = \frac{1 - 2z^2}{1 - 3z^2}, \qquad I^{[3]} = \frac{1 - 5z^2}{1 - 6z^2 + 3z^4}. \]


The interesting analysis of the dominant poles of the rational GFs, for any fixed h, is 
discussed in the paper [394]. Furthermore, simulations strongly suggest that the width of a ran- 
dom interconnection network on 2n nodes is tightly concentrated around n/2; see Figure V.13. 
Louchard [418] (see also Janson’s study [353]) succeeded in proving this fact and a good deal 
more. With high probability, the altitude (the altitude is defined here as the number of active 
arcs as time evolves) of a random network conforms asymptotically to a deterministic parabola 
$2nx(1-x)$ (with $x \in [0, 1]$) to which are superimposed random fluctuations of a smaller
amplitude, $O(\sqrt{n})$, well-characterized by a Gaussian process. In particular, the width of a random
network of $2n$ nodes converges in probability to $n/2$.
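
The finite continued fractions for bounded width lend themselves to a quick machine check. The Python sketch below is an illustration only: the helper names `series_inverse`, `bounded_width_series` and `odd_double_factorial` are hypothetical and not taken from the text. It expands the fraction with levels $1, \ldots, h$ as a truncated power series with integer coefficients. Since a network on $2n$ nodes never has more than $n$ active arcs, the coefficient of $z^{2n}$ should equal $1\cdot 3\cdots(2n-1)$ as long as $n \le h$.

```python
from math import prod


def series_inverse(a, order):
    """Coefficients of 1/a(z) up to z^order, assuming a[0] == 1."""
    b = [0] * (order + 1)
    b[0] = 1
    for n in range(1, order + 1):
        b[n] = -sum(a[k] * b[n - k] for k in range(1, n + 1))
    return b


def bounded_width_series(h, order):
    """Taylor coefficients of 1/(1 - 1z^2/(1 - 2z^2/( ... /(1 - h z^2))))."""
    s = [1] + [0] * order              # innermost tail: the constant 1
    for k in range(h, 0, -1):          # fold levels h, h-1, ..., 1
        denom = [1] + [0] * order      # denom(z) = 1 - k z^2 s(z)
        for n in range(2, order + 1):
            denom[n] = -k * s[n - 2]
        s = series_inverse(denom, order)
    return s


def odd_double_factorial(n):
    """1 * 3 * ... * (2n - 1); the empty product (n = 0) equals 1."""
    return prod(range(1, 2 * n, 2))


h = 6
coeffs = bounded_width_series(h, 2 * h)
print([coeffs[2 * n] for n in range(h + 1)])
print([odd_double_factorial(n) for n in range(h + 1)])
```

For $h = 2$ the same routine returns the coefficients of $(1-2z^2)/(1-3z^2)$, which grow like $3^{n}$ rather than like the odd double factorials, in line with the rational form stated above.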


▷ V.22. Bell numbers and continued fractions. With $S_n = n!\,[z^n]\,e^{e^z-1}$ a Bell number:
\[ \sum_{n\ge 0} S_n z^n
   = \cfrac{1}{1-1z-\cfrac{1z^2}{1-2z-\cfrac{2z^2}{1-3z-\cfrac{3z^2}{\ddots}}}}. \]
[Hint: Define an encoding like for networks, with level steps representing intermediate elements
of blocks [214].] Refinements include Stirling partition numbers and involution numbers. ◁


▷ V.23. Factorial numbers and continued fractions. One has
\[ \sum_{n\ge 0} n!\, z^n
   = \cfrac{1}{1-1z-\cfrac{1^2 z^2}{1-3z-\cfrac{2^2 z^2}{1-5z-\cfrac{3^2 z^2}{\ddots}}}}. \]
Refinements include tangent and secant numbers, as well as Stirling cycle numbers and Eulerian
numbers. (This continued fraction goes back to Euler [198]; see [214] for a proof based on
a bijection of Françon–Viennot [269] and Biane [63] for an alternative bijection.) ◁


▷ V.24. Surjection numbers and continued fractions. Let $R_n = n!\,[z^n]\,(2 - e^z)^{-1}$. Then
\[ \sum_{n\ge 0} R_n z^n
   = \cfrac{1}{1-1z-\cfrac{2\cdot 1^2 z^2}{1-4z-\cfrac{2\cdot 2^2 z^2}{1-7z-\cfrac{2\cdot 3^2 z^2}{\ddots}}}}. \]
This continued fraction is due to Flajolet [216]. ◁

▷ V.25. The Ehrenfest two-chambers model. (See Note II.11, p. 118 for context.) The OGF
of the number of evolutions that lead to chamber A full satisfies
\[ \sum_{n\ge 0} E_n^{[N]} z^n
   = \cfrac{1}{1-\cfrac{1\cdot N\, z^2}{1-\cfrac{2(N-1)\, z^2}{1-\cfrac{3(N-2)\, z^2}{\ddots}}}}
   = \frac{1}{2^N}\sum_{k=0}^{N}\binom{N}{k}\frac{1}{1-(N-2k)z}. \]
This results from the EGF of Note II.11 (p. 118), the Continued Fraction Theorem, and basic
properties of the Laplace transform. (This continued fraction expansion is originally due to
Stieltjes [562] and Rogers [516]. See also [304] for additional formulae.) ◁
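
The continued fractions of Notes V.22 to V.25 all have the shape $1/(1 - c_0 z - \lambda_1 z^2/(1 - c_1 z - \lambda_2 z^2/\cdots))$, so a single routine can test them numerically. The following Python sketch is only an illustration under that assumption; the function names and the truncation depth are choices made here, not part of the text. It expands such a fraction to a given order with exact integer arithmetic.

```python
def series_inverse(a, order):
    """Coefficients of 1/a(z) up to z^order, assuming a[0] == 1."""
    b = [0] * (order + 1)
    b[0] = 1
    for n in range(1, order + 1):
        b[n] = -sum(a[k] * b[n - k] for k in range(1, n + 1))
    return b


def jfraction_series(c, lam, order):
    """Expand 1/(1 - c(0) z - lam(1) z^2/(1 - c(1) z - lam(2) z^2/(...))).

    c and lam are functions of the level index; the fraction is cut after
    `order` levels, which does not affect coefficients up to z^order.
    """
    s = [1] + [0] * order
    for k in range(order, -1, -1):
        denom = [1] + [0] * order        # 1 - c(k) z - lam(k+1) z^2 s(z)
        if order >= 1:
            denom[1] = -c(k)
        for n in range(2, order + 1):
            denom[n] = -lam(k + 1) * s[n - 2]
        s = series_inverse(denom, order)
    return s


bell = jfraction_series(lambda k: k + 1, lambda k: k, 6)
factorials = jfraction_series(lambda k: 2 * k + 1, lambda k: k * k, 6)
surjections = jfraction_series(lambda k: 3 * k + 1, lambda k: 2 * k * k, 6)
print(bell, factorials, surjections)
```

For the Ehrenfest fraction one would take `c = lambda k: 0` and `lam = lambda k: k * (N - k + 1)`; the numerator vanishes at level $N+1$, so the fraction terminates by itself.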


V.5. Paths in graphs and automata 


In this section, we develop the framework of paths in graphs: given a graph, 
a source node, and a destination node, the problem is to enumerate all paths from 
the source to the destination in the graph. Non-negative weights acting multiplica- 
tively (probabilities, multiplicities) may be attached to edges. Applications include 
the analysis of walks in various types of graphs as well as languages described by 
finite automata. Under a fundamental structural condition, known as irreducibility and 
corresponding to strong-connectedness of the graph, generating functions of paths all 
have the same dominant singularity, which is a simple pole. This essential property im- 
plies simple exponential forms for the asymptotics of coefficients (possibly tempered 
by explicit congruence conditions in the periodic case). The corresponding results can 
equivalently be formulated in terms of the set of eigenvalues (the spectrum) of the cor- 
responding adjacency matrix and are related to the classical Perron—Frobenius theory 
of non-negative matrices—under irreducibility, only the largest positive eigenvalue 
matters asymptotically. 


V.5.1. Combinatorial aspects. A directed graph or digraph $\Gamma$ is determined by
the pair $(V, E)$ of its vertex set $V$ and its edge set $E \subset V \times V$. Here, self-loops
corresponding to edges of the form $(v, v)$ are allowed. Given an edge, $e = (a, b)$,
we denote its origin by $\operatorname{orig}(e) := a$ and its destination by $\operatorname{destin}(e) := b$. For $\Gamma$ a
digraph with vertex set identified to the set $\{1, \ldots, m\}$, we allow each edge $(a, b)$ to be
weighted by a quantity $g_{a,b}$, which we may take as a formal indeterminate for which
we allow the possibility of substituting positive weight values; the matrix $G$ such that
\[ G_{a,b} = g_{a,b} \ \text{ if the edge } (a,b) \in \Gamma, \qquad G_{a,b} = 0 \ \text{ otherwise}, \tag{78} \]
is called the weighted adjacency matrix of the (weighted) graph $\Gamma$ (Figure V.14). The
usual adjacency matrix of $\Gamma$ is obtained by the substitution $g_{a,b} \mapsto 1$.


A path is a sequence of edges, $w = (e_1, \ldots, e_n)$, such that, for all $j$ with
$1 \le j < n$, one has $\operatorname{destin}(e_j) = \operatorname{orig}(e_{j+1})$. The parameter $n$ is called the length of the
path and we define: $\operatorname{orig}(w) := \operatorname{orig}(e_1)$, $\operatorname{destin}(w) := \operatorname{destin}(e_n)$. A circuit is a
path whose origin and destination are the same vertex. Note that, with our definition,
a circuit has a distinguished origin. We do not identify here two circuits
such that one is obtained by circular permutation from the other: the circuits that we
consider, with such a distinguished root, are rooted circuits.

[Figure: the digraph $\Gamma$ on vertices 1, 2, 3, 4]
\[ G = \begin{pmatrix} 0 & g_{1,2} & 0 & g_{1,4} \\ 0 & 0 & g_{2,3} & 0 \\ g_{3,1} & 0 & 0 & 0 \\ 0 & g_{4,2} & 0 & 0 \end{pmatrix}, \]
\[ F^{\langle 1,1\rangle}(z) = 1 + g_{1,2}\,g_{2,3}\,g_{3,1}\,z^3 + g_{1,4}\,g_{4,2}\,g_{2,3}\,g_{3,1}\,z^4 + \cdots \]

Figure V.14. A graph $\Gamma$, its formal adjacency matrix $G$, and the generating
function $F^{\langle 1,1\rangle}(z)$ of paths from 1 to 1.


From the standard definition of matrix products, the powers $G^n$ have elements
that are path polynomials. More precisely, one has the simple but essential relation,
\[ (G^n)_{i,j} = \sum_{w \in \mathcal{F}_n^{\langle i,j\rangle}} w, \tag{79} \]
where $\mathcal{F}_n^{\langle i,j\rangle}$ is the set of paths in $\Gamma$ that connect $i$ to $j$ and have length $n$, and a path $w$
is identified with the monomial in indeterminates $\{g_{i,j}\}$ that represents multiplicatively
the succession of its edges; for instance:
\[ (G^3)_{i,j} = \sum_{v_1 = i,\ v_2,\ v_3,\ v_4 = j} g_{v_1,v_2}\, g_{v_2,v_3}\, g_{v_3,v_4}. \]
In other words: powers of the matrix associated to a graph generate all paths in the
graph, the weight of a path being the product of the weights of the individual edges
it comprises. (This fact probably constitutes the most basic result of algebraic graph
theory [66, p. 9].) One may then treat simultaneously all lengths of paths (and all
powers of matrices) by introducing the variable $z$ to record length.
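
As a small illustration of the relation between matrix powers and paths, the Python sketch below counts paths by repeated multiplication. It assumes the graph of Figure V.14 with every weight set to 1 and with vertices renumbered from 0; the names `mat_mul`, `path_counts` and `FIG_V14` are hypothetical helpers introduced for the example.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def path_counts(g, i, j, max_len):
    """Return [(G^n)[i][j] for n = 0..max_len]: paths of each length from i to j."""
    m = len(g)
    power = [[int(r == c) for c in range(m)] for r in range(m)]   # G^0 = I
    counts = []
    for _ in range(max_len + 1):
        counts.append(power[i][j])
        power = mat_mul(power, g)
    return counts


# Adjacency matrix read off Figure V.14, all weights replaced by 1.
FIG_V14 = [
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]

print(path_counts(FIG_V14, 0, 0, 8))
```

Every circuit rooted at the first vertex decomposes uniquely into elementary returns of length 3 or 4, so the counts obey $a_n = a_{n-3} + a_{n-4}$, in agreement with the first terms $1 + z^3 + z^4 + \cdots$ of the generating function in the figure.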


Proposition V.6. (i) Let $\Gamma$ be a digraph and let $G$ be the formal adjacency matrix
of $\Gamma$ as given by (78). The OGF $F^{\langle i,j\rangle}(z)$ of the set of all paths from $i$ to $j$ in $\Gamma$, with
$z$ marking length and $g_{a,b}$ the weight associated to edge $(a, b)$, is the entry $i, j$ of the
matrix $(I - zG)^{-1}$; namely
\[ F^{\langle i,j\rangle}(z) = \bigl((I - zG)^{-1}\bigr)_{i,j} = (-1)^{i+j}\,\frac{\Delta^{\langle j,i\rangle}(z)}{\Delta(z)}, \tag{80} \]
where $\Delta(z) = \det(I - zG)$ is the reciprocal polynomial of the characteristic polynomial
of $G$ and $\Delta^{\langle j,i\rangle}(z)$ is the determinant of the minor of index $j, i$ of $I - zG$.
(ii) The generating function of (rooted) circuits is expressible in terms of a logarithmic
derivative:
\[ \sum_{i} \bigl(F^{\langle i,i\rangle}(z) - 1\bigr) = -z\,\frac{\Delta'(z)}{\Delta(z)}. \tag{81} \]
In this algebraic statement, if one takes the $\{g_{a,b}\}$ as formal indeterminates, then
$F^{\langle i,j\rangle}(z)$ is a multivariate GF of paths in $z$ with the variable $g_{a,b}$ marking the number
of occurrences of edge $(a, b)$. The result applies, in particular, to the case where
the $g_{a,b}$ are assigned numerical values, in which case $[z^n]F^{\langle i,j\rangle}(z)$ becomes the total
weight of paths of length $n$, which we also refer to as “number of paths” in the
weighted graph.
Proof. For the proof, it is convenient to assume that the quantities $g_{a,b}$ are assigned
arbitrary real numbers, so that usual matrix operations (triangularization, diagonalization,
and so on) can be easily applied. As the properties expressed by the statement
are ultimately equivalent to a collection of multivariate polynomial identities, their
general validity is implied by the fact that they hold for all real assignments of values.
Part (i) is a consequence of the fundamental equivalence between paths and matrix
products (79), which implies
\[ F^{\langle i,j\rangle}(z) = \sum_{n\ge 0} z^n\,(G^n)_{i,j} = \bigl((I - zG)^{-1}\bigr)_{i,j}, \]
and from the cofactor formula of matrix inversion.

Part (ii) results from elementary properties of the matrix trace⁹ functional. With $m$
the dimension of $G$ and $\{\lambda_1, \ldots, \lambda_m\}$ the multiset of its eigenvalues, we have
\[ \sum_{i=1}^{m} F_n^{\langle i,i\rangle} = \operatorname{Tr} G^n = \sum_{j=1}^{m} \lambda_j^n, \tag{82} \]
where $F_n^{\langle i,j\rangle} = [z^n]F^{\langle i,j\rangle}(z)$. Upon taking a generating function, there results that
\[ \sum_{i=1}^{m}\sum_{n\ge 1} F_n^{\langle i,i\rangle} z^n = \sum_{j=1}^{m} \frac{\lambda_j z}{1 - \lambda_j z}, \tag{83} \]
which, up to a factor of $-z$, is none other than the logarithmic derivative of $\Delta(z)$. ∎
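
The circuit formula can be checked on a small case. The Python sketch below (an illustration with hypothetical helper names) again assumes the graph of Figure V.14 with unit weights, for which a hand expansion over vertex-disjoint cycles gives $\Delta(z) = 1 - z^3 - z^4$. It compares the traces $\operatorname{Tr} G^n$ with the Taylor coefficients of $-z\Delta'(z)/\Delta(z)$.

```python
def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def traces(g, max_len):
    """Return [Tr G^n for n = 1..max_len], the numbers of rooted circuits."""
    m = len(g)
    power = [[int(r == c) for c in range(m)] for r in range(m)]
    out = []
    for _ in range(max_len):
        power = mat_mul(power, g)
        out.append(sum(power[i][i] for i in range(m)))
    return out


def log_derivative_coeffs(delta, max_len):
    """Coefficients of z^1..z^max_len in -z*Delta'(z)/Delta(z); delta[0] must be 1."""
    numer = [-k * delta[k] if k < len(delta) else 0 for k in range(max_len + 1)]
    out = [0] * (max_len + 1)
    for n in range(max_len + 1):
        # Solve out(z) * Delta(z) = numer(z) coefficient by coefficient.
        out[n] = numer[n] - sum(delta[k] * out[n - k]
                                for k in range(1, min(n, len(delta) - 1) + 1))
    return out[1:]


FIG_V14 = [[0, 1, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0]]
DELTA = [1, 0, 0, -1, -1]          # 1 - z^3 - z^4, computed by hand for this graph

print(traces(FIG_V14, 8))
print(log_derivative_coeffs(DELTA, 8))
```

Both lists should coincide: the 3-cycle can be rooted at three vertices and the 4-cycle at four, which accounts for the first non-zero entries.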


▷ V.26. Positivity of inverses of characteristic polynomials. Let $G$ have non-negative
coefficients. Then, the rational function $Z_G(z) := 1/\det(I - zG)$ has non-negative Taylor
coefficients. More generally, if $G = (g_{a,b})$ is a matrix in the formal indeterminates $g_{a,b}$, then
$[z^n]Z_G(z)$ is a polynomial in the $g_{a,b}$ with non-negative coefficients. (Hint: The proof proceeds
by integration from (81): we have, for $1/\Delta(z)$, the equivalent expressions
\[ \frac{1}{\Delta(z)} = \exp\left(-\int_0^z \frac{\Delta'(t)}{\Delta(t)}\,dt\right)
   = \exp\left(\sum_{i=1}^{m}\int_0^z \bigl(F^{\langle i,i\rangle}(t) - 1\bigr)\,\frac{dt}{t}\right)
   = \exp\left(\sum_{n\ge 1}\frac{z^n}{n}\operatorname{Tr} G^n\right), \]
which ensure positivity of the coefficients of $Z_G$.) ◁
▷ V.27. MacMahon’s Master Theorem. Let $J$ be the determinant
\[ J(z_1, \ldots, z_m) := \begin{vmatrix}
   1 - z_1 g_{11} & -z_2 g_{12} & \cdots & -z_m g_{1m} \\
   -z_1 g_{21} & 1 - z_2 g_{22} & \cdots & -z_m g_{2m} \\
   \vdots & \vdots & \ddots & \vdots \\
   -z_1 g_{m1} & -z_2 g_{m2} & \cdots & 1 - z_m g_{mm}
   \end{vmatrix}. \]
MacMahon’s “Master Theorem” asserts the identity of coefficients,
\[ [z_1^{n_1} \cdots z_m^{n_m}]\,\frac{1}{J(z_1, \ldots, z_m)}
   = [z_1^{n_1} \cdots z_m^{n_m}]\; Y_1^{n_1} \cdots Y_m^{n_m},
   \qquad\text{where } Y_j = \sum_{i} g_{ji}\, z_i. \]
This result can be obtained by a simple change of variables in a multivariate Cauchy integral and
is related to multivariate Lagrange inversion [303, pp. 21-23]. Cartier and Foata [105] provide
a general combinatorial interpretation related to trace monoids of Note V.10, p. 307. ◁

⁹ If $H$ is an $m \times m$ matrix with multiset of eigenvalues $\{\mu_1, \ldots, \mu_m\}$, the trace is defined by
$\operatorname{Tr} H := \sum_{i=1}^{m} (H)_{ii}$ and, by triangularization (Jordan form), it satisfies $\operatorname{Tr} H = \sum_{j=1}^{m} \mu_j$.


▷ V.28. The Jacobi trace formula. This trace formula [303, p. 11] for square matrices is
\[ \det \circ \exp(M) = \exp \circ \operatorname{Tr}(M); \tag{84} \]
equivalently, with due care paid to determinations: $\log \circ \det(M) = \operatorname{Tr} \circ \log(M)$. It generalizes
the scalar identities $e^a e^b = e^{a+b}$ and $\log ab = \log a + \log b$. (Hint: recycle the computations
of Note V.26.) ◁


▷ V.29. Fast computation of the characteristic polynomial. The following algorithm is due
to Leverrier (1811-1877), the astronomer and mathematician who, together with Adams, first
predicted the position of the planet Neptune. Since, by (82) and (83), one has
\[ -z\,\frac{\Delta'(z)}{\Delta(z)} = \sum_{n\ge 1} z^n \operatorname{Tr} G^n, \]
it is possible to deduce an algorithm that determines the characteristic polynomial of a matrix
of dimension $m$ in $O(m^4)$ arithmetic operations. [Hint: computing the quantities $\operatorname{Tr} G^j$ for
$j = 1, \ldots, m$ is sufficient and requires precisely $m$ matrix multiplications.] ◁
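
One possible reading of the hint is sketched below in Python; it is offered as an illustration, not as the algorithm in its historical form, and the function names are hypothetical. Writing $\Delta(z) = \sum_n d_n z^n$ and identifying coefficients in $-z\Delta'(z) = \Delta(z)\sum_{n\ge 1} z^n \operatorname{Tr} G^n$ gives the recurrence $n\,d_n = -\sum_{k=1}^{n} \operatorname{Tr}(G^k)\, d_{n-k}$, which requires division by $n$ and is therefore carried out over the rationals.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def leverrier(g):
    """Coefficients [d_0, ..., d_m] of Delta(z) = det(I - zG), from Tr G^k, k = 1..m."""
    m = len(g)
    power = [[int(i == j) for j in range(m)] for i in range(m)]
    traces = []
    for _ in range(m):                       # exactly m matrix multiplications
        power = mat_mul(power, g)
        traces.append(sum(power[i][i] for i in range(m)))
    delta = [Fraction(1)]
    for n in range(1, m + 1):                # n d_n = -sum_k Tr(G^k) d_{n-k}
        acc = sum(traces[k - 1] * delta[n - k] for k in range(1, n + 1))
        delta.append(-acc / n)
    return delta


# Hypothetical test case: the graph of Figure V.14 with unit weights.
FIG_V14 = [[0, 1, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0]]
print(leverrier(FIG_V14))
```

The $m$ matrix products cost $O(m^3)$ operations each with the schoolbook method, and the recurrence adds only $O(m^2)$, which is consistent with the $O(m^4)$ bound of the note.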


▷ V.30. The Matrix Tree Theorem. Let $\Gamma$ be a directed graph without loops and associated
matrix $G$, with $g_{a,b}$ the weight of edge $(a, b)$. The Laplacian matrix $L[G]$ is defined by
\[ L[G]_{i,j} = -g_{i,j} + [\![ i = j ]\!]\,\delta_i, \qquad\text{where } \delta_i := \sum_{k} g_{i,k}. \]
Let $L_1[G]$ be the matrix obtained by deleting the first row and first column of $L[G]$. Then, the
“tree polynomial”
\[ T_1[G] := \det L_1[G] \]
enumerates all (oriented) spanning trees of $\Gamma$ rooted at node 1. (This classic result belongs to a
circle of ideas initiated by Kirchhoff, Sylvester, Borchardt and others in the nineteenth century.
See, for instance, the discussions by Knuth [377, p. 582-583] and Moon [445].) ◁
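
The statement is easy to try out numerically. The Python sketch below is an illustration with hypothetical names (`determinant`, `tree_polynomial`): it builds the Laplacian with row sums on the diagonal, deletes the row and column of the root (index 0 here, standing for node 1), and evaluates the determinant exactly by Gaussian elimination over the rationals.

```python
from fractions import Fraction


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:                      # a row swap flips the sign
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def tree_polynomial(g):
    """det L_1[G], with L[G]_{i,j} = -g_{i,j} + [i == j] * sum_k g_{i,k}."""
    m = len(g)
    lap = [[-g[i][j] + (sum(g[i]) if i == j else 0) for j in range(m)]
           for i in range(m)]
    reduced = [row[1:] for row in lap[1:]]    # drop first row and first column
    return determinant(reduced)


# Complete digraph on 4 vertices without loops, unit weights.
K4 = [[int(i != j) for j in range(4)] for i in range(4)]
print(tree_polynomial(K4))
```

For the complete digraph the value agrees with Cayley's count $m^{m-2}$ of labelled trees, each tree admitting exactly one orientation towards the chosen root.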


Weighted graphs, word models, and finite automata. The numeric substitution
$\sigma : g_{a,b} \mapsto 1$ transforms the formal adjacency matrix $G$ of $\Gamma$ into the usual adjacency
matrix. In particular, the number of paths of length $n$ is obtained, under this
substitution, as $[z^n](I - zG)^{-1}$. As already noted, it is possible to consider weighted
graphs, where the $g_{a,b}$ are assigned positive real-valued weights, with the weight of a
path being defined by the product of its edge weights. One finds that $[z^n](I - zG)^{-1}$
equals the total weight of all paths of length $n$. If furthermore the assignment is made
in such a way that $\sum_b g_{a,b} = 1$, for all $a$, then the matrix $G$, which is called a stochastic
matrix, can be interpreted as the transition matrix of a Markov chain. Naturally,
the formulae of Proposition V.6 continue to hold in all these cases.


Word problems corresponding to regular languages can be treated by the theory
of regular specifications whenever they have enough structure and an unambiguous
regular expression description is of tractable form. (This is the main theme of
Subsection I.4.1, p. 51, further pursued in Sections V.3 and V.4.) The dual point of view
of automata theory introduced in Subsection I.4.2 (p. 56) proves useful whenever no
such direct description is in sight. Finite automata can be reduced to the theory of
paths in graphs, so that Proposition V.6 is applicable to them. Indeed, the language $\mathcal{L}$
accepted by a finite automaton $A$, with set of states $Q$, initial state $q_0$, and $Q_f$ the set
of final states, decomposes as
\[ \mathcal{L} = \sum_{q \in Q_f} \mathcal{F}^{\langle q_0, q\rangle}, \]
where $\mathcal{F}^{\langle q_0, q\rangle}$ is the set of paths from the initial state $q_0$ to the final state $q$. (The
corresponding graph $\Gamma$ is obtained from $A$ by collapsing multiple edges between any
two vertices, $i$ and $j$, into a single edge equipped with a weight that is the sum of
the weights of all the letters leading from $i$ to $j$.) Proposition V.6 is then clearly
applicable.
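
The reduction from automata to paths can be made concrete as follows. The Python sketch is an illustration only: the automaton (binary words with no two consecutive 1s) and the names `count_accepted` and `NO_11` are assumptions chosen for the example. Parallel edges are collapsed into integer weights, and the row of $G^n$ indexed by the initial state is updated step by step.

```python
def count_accepted(transitions, initial, finals, max_len):
    """Number of accepted words of each length 0..max_len.

    transitions[q] maps a letter to the next state of a deterministic
    automaton; missing letters mean that the word is rejected.
    """
    m = len(transitions)
    # Collapse parallel edges: g[i][j] = number of letters leading from i to j.
    g = [[0] * m for _ in range(m)]
    for q, moves in enumerate(transitions):
        for target in moves.values():
            g[q][target] += 1
    row = [int(q == initial) for q in range(m)]      # row `initial` of G^0
    counts = []
    for _ in range(max_len + 1):
        counts.append(sum(row[q] for q in finals))
        row = [sum(row[i] * g[i][j] for i in range(m)) for j in range(m)]
    return counts


# State 0: the last letter is not 1; state 1: the last letter is 1.
NO_11 = [{"0": 0, "1": 1}, {"0": 0}]
print(count_accepted(NO_11, 0, {0, 1}, 6))
```

Here the collapsed matrix is $\left(\begin{smallmatrix}1&1\\1&0\end{smallmatrix}\right)$, so the counts are Fibonacci numbers, as expected for this language.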


Profiles. The term “profile of a set of paths”, as used here, means the collection
of the $m^2$ statistics $N = (N_{1,1}, \ldots, N_{m,m})$ where $N_{i,j}$ is the number of times the
edge $(i \to j)$ is traversed. This notion is, for instance, consistent with the notion of
profile given earlier for lattice paths in Section V.4. It also contains the information
regarding the letter composition of words in a regular language and is thus compatible
with the notion of profile introduced in Section V.3.

Let $\Gamma$ be a graph with edge $(a, b)$ weighted by $g_{a,b}$. Then, the BGF of paths with
$u$ marking the number of times a particular edge $(c, d)$ is traversed is in matrix form
\[ (I - z\widetilde{G})^{-1}, \qquad\text{with } \widetilde{G} = G\,[\,g_{c,d} \mapsto u\,g_{c,d}\,]. \]
The entry $(i, j)$ in this matrix gives the BGF of paths with origin $i$ and destination $j$.
The GF of cumulated values (moments of order 1) is then obtained in the usual way,
by differentiation followed by the substitution $u = 1$. Higher moments are similarly
attainable by successive differentiations.


V.5.2. Analytic aspects. In full generality, the components of a linear system
of equations may exhibit the whole variety of behaviours obtained for the OGFs of
regular languages in Section V.3, p. 300. However, positivity coupled with some
simple ancillary conditions (irreducibility and aperiodicity defined below) entails that
the GFs of interest closely resemble the extremely simple rational function,
\[ \frac{1}{1 - z/\rho} \equiv \frac{1}{1 - \lambda_1 z}, \]
where $\rho$ is the dominant positive singularity and $\lambda_1 = 1/\rho$ is a well-characterized
eigenvalue of $T$. Accordingly, the asymptotic phenomena associated with such systems
are highly predictable and coefficients involve the pure exponential form $c \cdot \rho^{-n}$.
We propose first to expound the general theory, then treat classical applications to
statistics of paths in graphs and languages recognized by finite automata.


Irreducibility and aperiodicity of matrices and graphs. From this point on, we 
only consider matrices with non-negative entries. Two notions are essential, irre- 
ducibility and aperiodicity (the terms are borrowed from Markov chain theory and 
matrix theory). 

For $A$ a scalar matrix of dimension $m \times m$ (with non-negative entries), a crucial
role is played by the dependency graph (p. 33); this is the (directed) graph with vertex
set $V = \{1, \ldots, m\}$ and edge set containing the directed edge $(a \to b)$ iff $A_{a,b} \ne 0$.
The reason for this terminology is the following: Let $A$ represent the linear transformation
$\bigl\{ y_i^\star = \sum_j A_{i,j}\, y_j \bigr\}_i$; then, the fact that an entry $A_{i,j}$ is non-zero means that
$y_i^\star$ depends effectively on $y_j$ and is translated by the directed edge $(i \to j)$ in the
dependency graph.

Figure V.15. Irreducibility conditions. Left: a strongly connected digraph. Right: a
weakly connected digraph that is not strongly connected decomposes as a collection
of strongly connected components linked by a directed acyclic graph.


Definition V.5. The non-negative matrix A is called irreducible if its dependency 
graph is strongly connected (i.e., any two vertices are connected by a directed path). 


By considering only simple paths, it is then seen that irreducibility is equivalent to
the condition that $(I + A)^m$ has all its entries strictly positive. See Figure V.15
for a graphical rendering of irreducibility and for the general structure of a (weakly
connected) digraph.
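
The matrix criterion translates directly into a test. The Python sketch below (an illustration; `is_irreducible` is a hypothetical name) works with the zero pattern only, using Boolean products in place of numerical ones, and takes the exponent to be the dimension $m$ as in the criterion above.

```python
def is_irreducible(a):
    """True iff (I + A)^m has no zero entry, m being the dimension of A.

    Only the zero pattern of the non-negative matrix A matters, so the
    powers are computed with Boolean products.
    """
    m = len(a)
    base = [[(i == j) or a[i][j] != 0 for j in range(m)] for i in range(m)]
    power = [[i == j for j in range(m)] for i in range(m)]
    for _ in range(m):
        power = [[any(power[i][k] and base[k][j] for k in range(m))
                  for j in range(m)] for i in range(m)]
    return all(all(row) for row in power)


print(is_irreducible([[0, 1, 0], [0, 0, 1], [1, 0, 0]]))   # directed 3-cycle
print(is_irreducible([[1, 1], [0, 1]]))                    # no edge back from 2 to 1
```

A depth-first search for strongly connected components would be asymptotically faster; the matrix form is kept here because it mirrors the stated equivalence.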


Definition V.6. A strongly connected digraph $\Gamma$ is said to be periodic with parameter
$d$ iff the vertex set $V$ can be partitioned into $d$ classes, $V = V_0 \cup \cdots \cup V_{d-1}$, in such a
way that any edge whose source is an element of a $V_j$ has its destination in $V_{j+1 \bmod d}$.

The largest possible $d$ is called the period. If no decomposition exists with $d \ge 2$,
so that the period has the trivial value 1, then the graph and all the matrices that admit
it as their dependency graph are called aperiodic.


For instance, a directed 10-cycle is periodic with parameters $d = 1, 2, 5, 10$
and the period is 10. Figure V.16 illustrates the notion. Periodicity implies that the
existence of paths of length $n$ between any two given nodes $i, j$ is constrained by the
congruence class $n \bmod d$. Conversely, aperiodicity entails the existence, for all $n$
sufficiently large, of paths of length $n$ connecting $i, j$.

Figure V.16. Periodicity notions: the overall structure of a periodic graph with $d = 4$
(left), an aperiodic graph (middle) and a periodic graph of period 2 (right).


From the definition, a matrix $A$ with period $d$ has, up to simultaneous permutation
of its rows and columns, a cyclic block structure
\[ \begin{pmatrix}
   0 & A_{0,1} & 0 & \cdots & 0 \\
   0 & 0 & A_{1,2} & \cdots & 0 \\
   \vdots & \vdots & \vdots & \ddots & \vdots \\
   0 & 0 & 0 & \cdots & A_{d-2,d-1} \\
   A_{d-1,0} & 0 & 0 & \cdots & 0
   \end{pmatrix} \]
where the blocks $A_{i,i+1}$ reflect the connectivity between $V_i$ and $V_{i+1}$. In the case of a
period $d$, the matrix $A^d$ admits a diagonal square block decomposition where each of
its diagonal blocks is aperiodic (and of a smaller dimension than the original matrix).
Then, the matrices $A^{\nu d}$ can be analysed block by block, and the analysis reduces to the
aperiodic case. Similarly for powers $A^{\nu d + r}$ for any fixed $r$ as $\nu$ varies. In other words,
the irreducible periodic case with period $d \ge 2$ can always be reduced to a collection
of $d$ irreducible aperiodic subproblems. For this reason, we usually postulate in our
statements both an irreducibility condition and an aperiodicity condition.
▷ V.31. Sufficient conditions for aperiodicity. Any one of the following conditions suffices to
guarantee aperiodicity of the non-negative matrix $T$:
(i) $T$ has (strictly) positive entries;
(ii) some power $T^s$ has (strictly) positive entries;
(iii) $T$ is irreducible and at least one diagonal element of $T$ is non-zero;
(iv) $T$ is irreducible and the dependency graph of $T$ is such that there exist two circuits
(closed paths) that are of relatively prime lengths.
(Any such condition implies in turn the existence of a unique dominant eigenvalue of $T$, which
is simple, according to Theorem V.7 and Note V.34 below.) ◁

▷ V.32. Computability of the period. There exists a polynomial time algorithm that determines
the period of a matrix. (Hint: in order to verify that $\Gamma$ is periodic with parameter $d$, develop a
breadth-first search tree, label nodes by their level, and check that edges have endpoints
satisfying suitable congruence conditions modulo $d$.) ◁
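
A possible realization of this hint is sketched below in Python, under the assumption that the input digraph is strongly connected and given by adjacency lists; the function name `period` is hypothetical. Every edge $u \to v$ must satisfy $\mathrm{level}(u) + 1 \equiv \mathrm{level}(v)$ modulo the period, so the period is obtained as the gcd of the discrepancies $\mathrm{level}(u) + 1 - \mathrm{level}(v)$ over all edges.

```python
from collections import deque
from math import gcd


def period(adj):
    """Period of a strongly connected digraph; adj[u] lists the successors of u."""
    level = {0: 0}
    queue = deque([0])
    d = 0
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:               # tree edge of the BFS
                level[v] = level[u] + 1
                queue.append(v)
            # Tree edges contribute 0; other edges constrain the period.
            d = gcd(d, level[u] + 1 - level[v])
    return d


ten_cycle = [[(i + 1) % 10] for i in range(10)]
print(period(ten_cycle))
```

The search visits each edge once, so the cost is linear in the size of the graph. For the graph of Figure V.14, which contains circuits of lengths 3 and 4, the same function returns 1.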
Paths in strongly connected graphs. For analytic combinatorics, the importance
of irreducibility and aperiodicity conditions stems from the fact that they guarantee
uniqueness and simplicity of a dominant pole of path generating functions.

Theorem V.7 (Asymptotics of paths in graphs). Consider the matrix
\[ F(z) = (I - zT)^{-1}, \]
where $T$ is a scalar non-negative matrix, in particular, the adjacency matrix of a
graph $\Gamma$ equipped with positive weights. Assume that $T$ is irreducible. Then all entries
$F^{\langle i,j\rangle}(z)$ of $F(z)$ have the same radius of convergence $\rho$, which can be defined in two
equivalent ways:

(i) as $\rho = \lambda_1^{-1}$ with $\lambda_1$ the largest positive eigenvalue of $T$;
(ii) as the smallest positive root of the determinantal equation: $\det(I - zT) = 0$.

Furthermore, the point $\rho = \lambda_1^{-1}$ is a simple pole of each $F^{\langle i,j\rangle}(z)$.

If $T$ is irreducible and aperiodic, then $\rho = \lambda_1^{-1}$ is the unique dominant singularity
of each $F^{\langle i,j\rangle}(z)$, and
\[ [z^n]F^{\langle i,j\rangle}(z) = \varphi_{i,j}\,\lambda_1^n + O(\Lambda^n), \qquad 0 \le \Lambda < \lambda_1, \]
for computable constants $\varphi_{i,j} > 0$.


Proof. The proof proceeds by stages, building up properties of the $F^{\langle i,j\rangle}$ by means
of the relations that bind them, with suitable exploitation of Proposition V.6, p. 337 in
conjunction with Pringsheim’s Theorem (p. 240). In parts (i)–(v), we assume that the
matrix $T$ is aperiodic. Periodicity is finally examined in part (vi).

(i) All $F^{\langle i,j\rangle}$ have the same radius of convergence. Simple upper and lower
bounds show that each $F^{\langle i,j\rangle}$ has a finite non-zero radius of convergence $\rho_{i,j}$. By
Pringsheim’s Theorem, this $\rho_{i,j}$ is necessarily a singularity of the function $F^{\langle i,j\rangle}$.
Since each $F^{\langle i,j\rangle}$ is a rational function, it then has a pole at $\rho_{i,j}$, hence becomes
infinite as $z \to \rho_{i,j}$. Now, the matrix $F$ satisfies the identities
\[ F = I + zTF, \qquad\text{and}\qquad F = I + zFT. \tag{85} \]
Thus, given that $T$ is irreducible, each $F^{\langle i,j\rangle}$ is positively (linearly) related to any
other $F^{\langle k,\ell\rangle}$. Then, the $F^{\langle i,j\rangle}$ must all become infinite as soon as one of them does.
Consequently, all the $\rho_{i,j}$ are equal; we let $\rho$ denote their common value.


(ii) All poles are of the same multiplicity. By a similar argument, we see that all
the $F^{\langle i,j\rangle}$ must have the same multiplicity $\kappa$ of their common pole $\rho$, since otherwise,
one function would be of slower growth, and a contradiction would result with the
linear relations stemming from (85). We thus have, for some $\varphi_{i,j} > 0$ and $\kappa \ge 1$:
\[ F^{\langle i,j\rangle}(z) \underset{z\to\rho}{\sim} \frac{\varphi_{i,j}}{(1 - z/\rho)^{\kappa}}. \]
(iii) The common multiplicity of poles is $\kappa = 1$. This property results from
the expression of the GF of all rooted circuits (Proposition V.6, Part (ii)) in terms of a
logarithmic derivative, which has by construction only simple poles. Hence, a positive
linear combination of some of the $F^{\langle i,j\rangle}$ has only a simple pole, so that $\kappa = 1$ and
\[ F^{\langle i,j\rangle}(z) \underset{z\to\rho}{\sim} \frac{\varphi_{i,j}}{1 - z/\rho}. \tag{86} \]
Another consequence is that we have $\rho = 1/\lambda_1$, where $\lambda_1$ is an eigenvalue of matrix
$T$, which then satisfies the property that $\lambda_1 \ge |\lambda|$ for any eigenvalue $\lambda$ of $T$: in matrix
theory terminology, such an eigenvalue is called dominant¹⁰.


(iv) There are positive dominant eigenvectors. From the relations (85) satisfied
by the $F^{\langle i,j\rangle}(z)$ with $j$ fixed and from (86), one finds as $z \to \rho$
\[ \frac{\varphi_{i,j}}{1 - z/\rho} \sim \rho \sum_{k} \frac{t_{i,k}\,\varphi_{k,j}}{1 - z/\rho},
   \qquad\text{where } T = (t_{i,j}). \tag{87} \]
This expresses the fact that the column vector $(\varphi_{1,j}, \ldots, \varphi_{m,j})^t$ is a right eigenvector
corresponding to the eigenvalue $\lambda_1 = \rho^{-1}$. Similarly, for each fixed $i$, the row vector
$(\varphi_{i,1}, \ldots, \varphi_{i,m})$ is found to be a left eigenvector. By part (ii), these eigenvectors
have all their components strictly positive.


(v) The eigenvalue $\lambda_1$ is simple. This property is needed in order to identify the
$\varphi_{i,j}$ coefficients. We base our proof on the Jordan normal form and simple inequalities.

Assume first that there are two different Jordan blocks corresponding to the eigenvalue
$\lambda_1$. Then there exist two vectors, $v = (v_1, \ldots, v_m)^t$ and $w = (w_1, \ldots, w_m)^t$,
such that
\[ Tv = \lambda_1 v, \qquad Tw = \lambda_1 w, \]
where we may assume that the eigenvector $v$ has positive coordinates, given part (iv).
Let $j_0$ be an index such that
\[ \frac{|w_{j_0}|}{v_{j_0}} = \max_{j=1..m} \frac{|w_j|}{v_j}. \]
By possibly changing $w$ to $-w$ and by rescaling, we may freely assume that $w_{j_0} = v_{j_0}$.
Also, since $v$ and $w$ are not collinear, there must exist $j_1$ such that $|w_{j_1}| < v_{j_1}$.
In summary:
\[ w_{j_0} = v_{j_0}, \qquad |w_{j_1}| < v_{j_1}, \qquad \forall j:\ |w_j| \le v_j. \tag{88} \]
Consider finally the two relations $T^m v = \lambda_1^m v$ and $T^m w = \lambda_1^m w$, and examine
consequences for the $j_0$ components. One has
\[ \lambda_1^m v_{j_0} = \sum_{k=1}^{m} U_{j_0,k}\, v_k, \qquad \lambda_1^m w_{j_0} = \sum_{k=1}^{m} U_{j_0,k}\, w_k, \tag{89} \]
where each $U_{j,k}$, the $(j, k)$ entry of $T^m$, is positive, by the irreducibility and aperiodicity
assumptions. But then, by the triangle inequality, there is a contradiction between
(89) and (88). Thus, there cannot be two distinct Jordan blocks corresponding
to $\lambda_1$.

It only remains to exclude the existence of a Jordan block of dimension $\ge 2$
associated with $\lambda_1$. If such a Jordan block were present, there would exist a vector $w$
such that
\[ \begin{array}{ll}
   Tv = \lambda_1 v, & T^{\nu m} v = \lambda_1^{\nu m} v, \\[2pt]
   Tw = \lambda_1 w + v, & T^{\nu m} w = \lambda_1^{\nu m} w + \nu m\,\lambda_1^{\nu m - 1} v.
   \end{array} \tag{90} \]
By simple bounds obtained from comparing $w$ to $v$ componentwise, it is found that
the vector $T^{\nu m} w$ must have all its coordinates that are $O(\lambda_1^{\nu m})$. Upon taking $\nu \to \infty$,
a contradiction is reached with the last relation of (90), where the growth of these
coordinates is of the form $\nu\lambda_1^{\nu m}$. Thus, a Jordan block of dimension $\ge 2$ is also
excluded, and the eigenvalue $\lambda_1$ is simple.

¹⁰ In matrix theory, a dominant eigenvalue ($\lambda_1$) is one that is largest in modulus, while, for an analytic
function, a dominant singularity ($\rho$) is one that is smallest in modulus. The two notions are reconciled by
the fact that singularities of generating functions are inverses of eigenvalues of matrices ($\rho = 1/\lambda_1$).


(vi) Aperiodicity of $T$ is equivalent to the existence of a unique dominant eigenvalue.
If $\lambda_1$ uniquely dominates, meaning that $\lambda_1 > |\lambda|$ for all eigenvalues $\lambda \ne \lambda_1$,
then each $F^{\langle i,j\rangle}$ has a simple pole at $\rho$ that is its unique dominant singularity. Hence
the coefficients $[z^n]F^{\langle i,j\rangle}(z)$ are non-zero for $n$ large enough, since they are asymptotic
to $\varphi_{i,j}\,\rho^{-n}$ by (86). This last property ensures aperiodicity.

Conversely, if $T$ is aperiodic, then $\lambda_1$ uniquely dominates. Indeed, suppose that
$\mu$ is an eigenvalue of $T$ such that $|\mu| = \lambda_1$, with $w$ a corresponding eigenvector. We
would have $T^m v = \lambda_1^m v$ and $T^m w = \mu^m w$. But then, by an argument similar to the
one used in part (v), upon making use of inequalities (88), we would need to have $w$
and $v$ collinear, which is absurd.

We leave it as an exercise to the reader to verify the stronger property that identifies
the period with the number of dominant eigenvalues: see Note V.33. ∎


Several of these arguments will inspire the discussion, in Chapter VII, of the 
harder problem of analysing coefficients of algebraic functions defined by positive 
polynomial systems (Subsection VII. 6.3, p. 488). 


▷ V.33. Periodicities. If $T$ has period $d$, then the support of each $F^{\langle i,j\rangle}(z)$ is included in a
single residue class modulo $d$ (in $d\mathbb{Z}$ itself when $i = j$), hence there are at least $d$ conjugate
singularities, corresponding to eigenvalues of the form $\lambda_1 e^{2i\pi k/d}$. There are no other dominant
eigenvalues since $T^d$ is built out of irreducible blocks, each with the unique dominant
eigenvalue $\lambda_1^d$. ◁


▷ V.34. The classical Perron–Frobenius Theorem. The proof of Theorem V.7 immediately
gives the following famous statement.

Theorem (Perron–Frobenius Theorem). Let $A$ be a matrix with non-negative elements
that is assumed to be irreducible. The eigenvalues of $A$ can be ordered in such a way that
\[ \lambda_1 = |\lambda_2| = \cdots = |\lambda_d| > |\lambda_{d+1}| \ge |\lambda_{d+2}| \ge \cdots, \]
and all the eigenvalues of largest modulus are simple. Furthermore, the quantity $d$ is precisely
equal to the period of the dependency graph. In particular, in the aperiodic case $d = 1$, there
is uniqueness of the dominant eigenvalue. In the periodic case $d \ge 2$, the whole spectrum has a
rotational symmetry: it is invariant under the set of transformations
\[ \lambda \mapsto \lambda\, e^{2i\pi k/d}, \qquad k = 0, 1, \ldots, d-1. \]
The properties of positive and of non-negative matrices have been superbly elicited by
Perron [478] in 1907 and by Frobenius [271] in 1908–1912. The corresponding theory has
far-reaching implications: it lies at the basis of the theory of finite Markov chains and it extends
to positive operators in infinite-dimensional spaces [390]. Excellent treatments of
Perron–Frobenius theory are to be found in the books of Bellman [34, Ch. 16], Gantmacher [276,
Ch. 13], as well as Karlin and Taylor [363, p. 536-551]. ◁

▷ V.35. Unrooted circuits. Consider a strongly connected weighted graph $\Gamma$ with adjacency
matrix $G = (g_{i,j})$. Let $\mathcal{RC}$ be the class of all rooted circuits and $\mathcal{PRC}$ the subclass of those
that are primitive (i.e., they differ from all their cyclic shifts). Let also $\mathcal{UC}$ be the class of all
unrooted circuits (no origin distinguished) and $\mathcal{PUC}$ the subclass of those that are primitive.
Define the adjacency matrix $G^{\circ s} := (g_{i,j}^{\,s})$ obtained by raising each entry of $G$ to the $s$th
power. Set finally $\Delta_G(z) := \det(I - zG)$. We find
\[ RC(z, G) = \sum_{k\ge 1} PRC(z^k, G^{\circ k}), \qquad
   PUC(z, G) = \int_0^z PRC(t, G)\,\frac{dt}{t}, \qquad
   UC(z, G) = \sum_{k\ge 1} PUC(z^k, G^{\circ k}), \]
upon mimicking the reasoning of Appendix A.4: Cycle construction, p. 729. This results in
\[ UC(z) = \sum_{k\ge 1} \frac{\varphi(k)}{k}\,\log\frac{1}{\Delta_{G^{\circ k}}(z^k)}, \]
with $\varphi(k)$ the Euler totient function, and
\[ [z^n]\,UC(z) = \frac{\lambda_1^n}{n} + O(\Lambda^n), \qquad [z^n]\,PUC(z) = \frac{\lambda_1^n}{n} + O(\Lambda^n), \]
where the two asymptotic estimates hold under irreducibility and aperiodicity conditions. These
estimates can be regarded as a Prime Number Theorem for walks in graphs. (See [555] for
related facts and zeta functions of graphs.) ◁


Profiles. The proof of Theorem V.7 additionally provides the form of a certain
“residue matrix”, from which several probabilistic properties of paths follow.

Lemma V.1 (Iteration of irreducible matrices). Let the non-negative matrix $T$ be
irreducible and aperiodic, with $\lambda_1$ its dominant eigenvalue. Then the residue matrix $\Phi$
such that
\[ (I - zT)^{-1} = \frac{\Phi}{1 - z\lambda_1} + O(1) \qquad (z \to \lambda_1^{-1}) \tag{91} \]
has entries given by ($\langle x, y\rangle$ represents the scalar product $\sum_i x_i y_i$)
\[ \varphi_{i,j} = \frac{r_i\,\ell_j}{\langle r, \ell\rangle}, \]
where $r$ and $\ell$ are, respectively, right and left eigenvectors of $T$ corresponding to the
eigenvalue $\lambda_1$.
Proof. We have seen that the matrix $\Phi = (\varphi_{i,j})$ has its columns and rows proportional,
respectively, to right and left eigenvectors belonging to the eigenvalue $\lambda_1$. Thus, we
have
\[ \frac{\varphi_{i,j}}{\varphi_{1,j}} = \frac{\varphi_{i,1}}{\varphi_{1,1}}, \]
while the $\varphi_{1,j}$ (respectively, $\varphi_{i,1}$) are the coordinates of a left (respectively, right)
eigenvector. There results that there exists a normalization constant $\xi$ such that
\[ \varphi_{i,j} = \xi\, r_i\, \ell_j. \]
That normalization constant is then determined by the fact that the GF of circuits has
residue equal to $\rho = \lambda_1^{-1}$ at $z = \rho$, so that $\sum_i \varphi_{i,i} = 1$, leading to
\[ \xi \sum_{j} r_j\, \ell_j = 1, \]
which implies the statement. ∎
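
The lemma can be observed numerically on a small example. The Python sketch below is an illustration under explicit assumptions: the matrix $T = \left(\begin{smallmatrix}1&1\\1&0\end{smallmatrix}\right)$ is a test case chosen here, with dominant eigenvalue $\lambda_1 = (1+\sqrt5)/2$ and, by symmetry, identical left and right eigenvectors $(\lambda_1, 1)$; the helper names are hypothetical. It compares the entries of $T^n/\lambda_1^n$ with $r_i \ell_j/\langle r, \ell\rangle$.

```python
from math import sqrt


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    m = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(m)]
            for i in range(m)]


def mat_pow(g, n):
    """n-th power of g by repeated multiplication (exact for integer entries)."""
    m = len(g)
    result = [[int(i == j) for j in range(m)] for i in range(m)]
    for _ in range(n):
        result = mat_mul(result, g)
    return result


def residue_matrix(r, l):
    """Matrix with entries r_i * l_j / <r, l>."""
    norm = sum(x * y for x, y in zip(r, l))
    return [[ri * lj / norm for lj in l] for ri in r]


T = [[1, 1], [1, 0]]                  # assumed example; T is symmetric
LAMBDA_1 = (1 + sqrt(5)) / 2          # dominant eigenvalue of T
R = [LAMBDA_1, 1.0]                   # right eigenvector: T R = LAMBDA_1 R
L = [LAMBDA_1, 1.0]                   # left eigenvector (same, by symmetry)

n = 40
power = mat_pow(T, n)
phi = residue_matrix(R, L)
for i in range(2):
    print([power[i][j] / LAMBDA_1 ** n for j in range(2)], phi[i])
```

The second eigenvalue of this matrix is $-1/\lambda_1$, so the relative discrepancy decays like $\lambda_1^{-2n}$ and is already below floating-point resolution for $n = 40$.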


Equipped with the lemma, we can now state: 


Theorem V.8 (Profiles of paths in graphs). Let $G$ be a non-negative matrix associated
to a weighted digraph $\Gamma$, assumed to be irreducible and aperiodic. Let $\ell, r$ be,
respectively, the left and right eigenvectors corresponding to the dominant
(Perron–Frobenius) eigenvalue $\lambda_1$. Consider the collection $\mathcal{F}^{\langle a,b\rangle}$ of (weighted) paths in $\Gamma$ with
fixed origin $a$ and final destination $b$. Then, the number of traversals of edge $(s, t)$ in
a random element of $\mathcal{F}_n^{\langle a,b\rangle}$ has mean
\[ \tau_{s,t}\, n + O(1) \qquad\text{where}\qquad \tau_{s,t} := \frac{\ell_s\, g_{s,t}\, r_t}{\lambda_1\,\langle \ell, r\rangle}. \tag{92} \]
In other words, a long random path tends to spend asymptotically a fixed (non-zero)
fraction of its time traversing any given edge. Accordingly, the number of visits to
vertex $s$ is also proportional to $n$ and obtained by summing the expression of (92) over
all the possible values of $t$.


Proof. First, the total weight ("number") of paths in $\mathcal{F}_n^{\langle a,b \rangle}$ satisfies

(93)    $[z^n] \left[ (I - zG)^{-1} \right]_{a,b} \sim \lambda_1^n \, \dfrac{r_a \ell_b}{\langle \ell, r \rangle}$,

as follows from Lemma V.1. Next, introduce the modified matrix $H = (h_{i,j})$ defined
by

        $h_{i,j} = g_{i,j}\, u^{[\![ i = s \,\wedge\, j = t ]\!]}$.

In other words, we mark each traversal of edge $(s, t)$ by the variable $u$. Then, the
quantity

(94)    $[z^n] \left[ \dfrac{\partial}{\partial u} (I - zH)^{-1} \Big|_{u=1} \right]_{a,b}$

represents the total number of traversals of edge $(s, t)$, with weights taken into
account. Simple algebra$^{11}$ shows that

(95)    $\dfrac{\partial}{\partial u} (I - zH)^{-1} \Big|_{u=1} = (I - zG)^{-1} \, (zH') \, (I - zG)^{-1}$,

where $H' := (\partial_u H)_{u=1}$ has all its entries equal to 0, except for the $s,t$ entry whose
value is $g_{s,t}$. By the calculation of the residue matrix in Lemma V.1, the coefficient
of (94) is then asymptotic to

(96)    $[z^n] \, \dfrac{\varphi_{a,s}}{1 - \lambda_1 z} \; g_{s,t}\, z \; \dfrac{\varphi_{t,b}}{1 - \lambda_1 z} \sim \varrho\, n \lambda_1^{n-1}$,    $\varrho := \dfrac{r_a \ell_s \, g_{s,t} \, r_t \ell_b}{\langle \ell, r \rangle^2}$.

Comparison of (96) and (93) finally yields the result since the relative error terms are
$O(n^{-1})$ in each case. ∎

$^{11}$ If $A$ is an operator depending on $u$, one has $\partial_u (A^{-1}) = -A^{-1} (\partial_u A) A^{-1}$, which is a
non-commutative generalization of the usual differentiation rule for inverses.
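Formula (95) also gives an exact finite computation: a traversal of $(s, t)$ at step $k+1$ splits a path of length $n$ into $k$ steps from $a$ to $s$, the edge itself, and $n-1-k$ steps from $t$ to $b$. The following sketch, under the assumption of a small illustrative matrix (not from the text) with $\lambda_1 = 2$, computes the exact mean and compares it with $\tau_{s,t}\, n$ from (92).

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def powers(G, n):
    """List [G^0, G^1, ..., G^n]."""
    size = len(G)
    out = [[[int(i == j) for j in range(size)] for i in range(size)]]
    for _ in range(n):
        out.append(mat_mul(out[-1], G))
    return out

def mean_traversals(G, a, b, s, t, n):
    """Exact mean number of traversals of edge (s, t) over weighted
    paths of length n from a to b."""
    P = powers(G, n)
    total = P[n][a][b]
    # k steps a -> s, then the edge (s, t), then n-1-k steps t -> b
    marked = sum(P[k][a][s] * G[s][t] * P[n - 1 - k][t][b] for k in range(n))
    return Fraction(marked, total)

def tau(G, l, r, lam1, s, t):
    """The constant of (92): l_s g_{s,t} r_t / (lam1 <l, r>)."""
    return Fraction(l[s] * G[s][t] * r[t], lam1 * sum(x * y for x, y in zip(l, r)))

# Illustrative data (assumptions): G has eigenvalues 2 and -1.
G = [[1, 2], [1, 0]]
l, r, lam1 = [1, 1], [2, 1], 2
```

For instance, `mean_traversals(G, 0, 0, 0, 1, 40)` stays within a bounded distance of `40 * tau(G, l, r, lam1, 0, 1)`, in line with the $O(1)$ error term.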




Another consequence of this last proof and Equation (93) is that the numbers of
paths starting at $a$ and ending at either $b$ or $c$ satisfy

(97)    $\lim_{n \to \infty} \dfrac{F_n^{\langle a,b \rangle}}{F_n^{\langle a,c \rangle}} = \dfrac{\ell_b}{\ell_c}$.

In other words, the quantity

        $\dfrac{\ell_b}{\sum_j \ell_j}$

is the asymptotic probability that a random path with origin fixed at some point $a$ but
otherwise unconstrained will end up at point $b$ after a large number of steps. Such
properties are strongly evocative of Markov chain theory discussed below in Example V.13,
p. 352.


▷ V.36. Residues and projections. Let $E = \mathbb{C}^m$ be the ambient space, where $m$ is the dimension
of $T$, assumed to be irreducible and aperiodic. There exists a direct sum decomposition
$E = F_1 \oplus F_2$ where $F_1$ is the one-dimensional eigenspace generated by the eigenvector $(r)$
corresponding to eigenvalue $\lambda_1$ and $F_2$ is the supplementary space which is the direct sum of
characteristic spaces corresponding to the other eigenvalues $\lambda_2, \ldots$. (For the purposes of the
present discussion, one may freely think of the matrix as diagonalizable, with $F_2$ the union of
eigenspaces associated to $\lambda_2, \ldots$.) Then $T$ as a linear operator acting on $E$ admits the
decomposition

        $T = \lambda_1 P + S$,

where $P$ is the projector on $F_1$ and $S$ acts on $F_2$ with spectral radius $|\lambda_2|$, as illustrated by the
diagram:

(98)    [Diagram: the action of $T$ on $E = F_1 \oplus F_2$.]

By standard properties of projections, $P^2 = P$ and $PS = SP = 0$ so that $T^n = \lambda_1^n P + S^n$.
Consequently, there holds,

(99)    $(I - zT)^{-1} = \sum_{n \ge 0} z^n \left( \lambda_1^n P + S^n \right) = \dfrac{P}{1 - \lambda_1 z} + (I - zS)^{-1}$.

Thus, the residue matrix $\Phi$ coincides with the projector $P$.
From this, one finds also

(100)   $(I - zT)^{-1} = \dfrac{\Phi}{1 - \lambda_1 z} + \sum_{k \ge 0} R_k \left( z - \lambda_1^{-1} \right)^k$,    $R_k := S^k \left( I - \lambda_1^{-1} S \right)^{-k-1}$,

which provides a full expansion. ◁


▷ V.37. Algebraicity of the residues. One only needs to solve one polynomial equation in
order to determine $\lambda_1$. Then the entries of $\Phi$ and the $R_k$ in (100) are all obtained by rational
operations in the field generated by the entries of $T$ extended by the algebraic quantity $\lambda_1$: for
instance, in order to get an eigenvector, it suffices to replace one of the equations of the system
$Tr = \lambda_1 r$ by a normalization condition, like $r_1 + \cdots + r_m = 1$. (Numerical procedures are
likely to be used instead for large matrices.) ◁
Automata and words. By Proposition V.6 (p. 337), the OGF of the language defined
by a deterministic finite automaton is expressible in terms of the quasi-inverse
$(I - zT)^{-1}$, where the matrix $T$ is a direct encoding of the automaton's transitions.
Corollary V.7 and Lemma V.1 have been precisely custom-tailored for this situation.
We shall allow weights on letters of the alphabet, corresponding to a Bernoulli model
on words. We say that an automaton is irreducible (respectively, aperiodic) if the
underlying graph and the associated matrix are irreducible (respectively, aperiodic).


Proposition V.7 (Random words and automata). Let $\mathcal{L}$ be a language recognized by
a deterministic finite automaton $A$ whose graph is irreducible and aperiodic. The
number of words of $\mathcal{L}$ satisfies

        $L_n \sim c \lambda_1^n + O(\Lambda^n)$,

where $\lambda_1$ is the dominant (Perron–Frobenius) eigenvalue of the transition matrix of $A$
and $c, \Lambda$ are real constants with $c > 0$ and $0 \le \Lambda < \lambda_1$.

In a random word of $\mathcal{L}_n$, the number of traversals of a designated vertex or edge
has a mean that is asymptotically linear in $n$, as given by Theorem V.8.
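As an illustration of the proposition, the counts $L_n$ can be obtained by iterating the transition matrix on the indicator vector of the initial state. The sketch below is an assumed example, not taken from the text: a two-state automaton for binary words with no two consecutive 1s, whose dominant eigenvalue is the golden ratio. The function and variable names are hypothetical.

```python
def count_words(T, init, finals, n_max):
    """Return [L_0, ..., L_n_max], where L_n is the number of words of
    length n accepted by the automaton with transition-count matrix T."""
    m = len(T)
    vec = [1 if q == init else 0 for q in range(m)]   # row vector e_init * T^n
    counts = []
    for _ in range(n_max + 1):
        counts.append(sum(vec[q] for q in finals))
        vec = [sum(vec[p] * T[p][q] for p in range(m)) for q in range(m)]
    return counts

# State 0: start, or last letter was 0.  State 1: last letter was 1.
# From state 1 the letter 1 is refused, hence the zero entry.
T = [[1, 1],
     [1, 0]]
COUNTS = count_words(T, init=0, finals=[0, 1], n_max=40)
GROWTH = COUNTS[40] / COUNTS[39]   # numerical estimate of lambda_1
```

The ratio of consecutive counts converges geometrically fast to $\lambda_1$, which reflects the $O(\Lambda^n)$ error term.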
▷ V.38. Unambiguous automata. A non-deterministic finite state automaton is said to be
unambiguous if the set of accepting paths for any given word comprises at most one element. The
translation into a generating function as described above also applies to such automata, even
though they are non-deterministic. ◁

▷ V.39. Concentration of distribution for the number of passages. Under the conditions of
the theorem, the standard deviation of the number of traversals of a designated node or edge
is $O(\sqrt{n})$. Thus in a random long path, the distribution of the number of such traversals is
concentrated. [Compared to (95), the calculation of the second moment requires taking a further
derivative, which leads to a triple pole. The second moment and the square of the mean, which
are each $O(n^2)$, are then found to cancel to main asymptotic order.] ◁


V.5.3. Applications. We now provide a few applications of Theorems V.7 and V.8.
First, Example V.11 studies briefly the case of words that are locally constrained in
the sense that certain transitions between letters are forbidden; Example V.12 revisits
walks on an interval and develops an alternative matrix view of a problem otherwise
amenable to continued fraction theory. Next, Example V.13 makes explicit the way
the fundamental theorem of finite Markov chain theory can be derived effortlessly as a
consequence of the more general Theorem V.8, and Example V.14 compares, on a simple
problem, the devil's staircase, the combinatorial and the Markovian approaches.
Example V.15 comes back to words and develops simple consequences of an important
combinatorial construction, that of de Bruijn graphs. This graph is invaluable in
predicting in many cases the shape of the asymptotic results that are to be expected
when confronted with word problems. Finally, Example V.16 concludes this section
with a brief discussion of the special case of words with excluded patterns, thereby
leading to a quantitative version of Borges' Theorem (Note I.35, p. 61).

In all these cases, the counting estimates are of the form $c\lambda^n$, whereas the
expectations of parameters of interest have a linear growth.


Example V.11. Locally constrained words. Consider a fixed alphabet $\mathcal{A} = \{a_1, \ldots, a_m\}$ and a
set $\mathcal{F} \subseteq \mathcal{A}^2$ of forbidden transitions between consecutive letters. The set of words over $\mathcal{A}$ with
no such forbidden transition is denoted by $\mathcal{L}$ and is called a locally constrained language. (The
particular case where exactly all pairs of equal letters are forbidden corresponds to Smirnov
words and has been discussed on p. 262.)

        $T = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix}$

Figure V.17. Locally constrained words: The transition matrix ($T$) associated to the
forbidden pairs $\mathcal{F} = \{ac, ad, bb, cb, cc, cd, da, db\}$, the corresponding automaton,
and the graph with widths of vertices and edges drawn in proportion to their asymptotic
frequencies.

Clearly, the words of $\mathcal{L}$ are recognized by an automaton whose state space is isomorphic
to $\mathcal{A}$: state $q$ simply memorizes the fact that the last letter read was a $q$. The graph of the
automaton is then obtained by the collection of allowed transitions $(q, r) \mapsto r$, with $(q, r) \notin \mathcal{F}$.
(In other words, the graph of the automaton is the complete graph in which all edges that
correspond to forbidden transitions are deleted.) Consequently, the OGF of any locally constrained
language is a rational function. Its OGF is given by

        $(1, 1, \ldots, 1)\, (I - zT)^{-1}\, (1, 1, \ldots, 1)^t$,

where $T_{ij}$ is 0 if $(a_i, a_j) \in \mathcal{F}$ and 1 otherwise. If each letter can occur later than any other letter
in an accepted word, the automaton is irreducible. Also, the graph is aperiodic except in a few
degenerate cases (e.g., in the case where the allowed transitions would be $a \to b, c$; $b \to d$;
$c \to d$; $d \to a$). Under irreducibility and aperiodicity, the number of words must be $\sim c \lambda_1^n$
and each letter has on average an asymptotically constant frequency. (See (34) and (35) of
Chapter IV, p. 262, for the case of Smirnov words.)

For the example of Figure V.17, the alphabet is $\mathcal{A} = \{a, b, c, d\}$. There are eight forbidden
transitions and the characteristic polynomial $\chi_G(\lambda) := \det(\lambda I - G)$ is found to be $\lambda^3(\lambda - 2)$.
Thus, one has $\lambda_1 = 2$. The right and left eigenvectors are found to be

        $r = (2, 2, 1, 1)^t$,    $\ell = (2, 1, 1, 1)$.

Then, the matrix $\tau$, where $\tau_{s,t}$ represents the asymptotic frequency of transitions from letter $s$
to letter $t$, is found in accordance with Theorem V.8:

        $\tau = \begin{pmatrix} \frac14 & \frac14 & 0 & 0 \\ \frac18 & 0 & \frac1{16} & \frac1{16} \\ \frac18 & 0 & 0 & 0 \\ 0 & 0 & \frac1{16} & \frac1{16} \end{pmatrix}$.

This means that a random path spends a proportion equal to 1/4 of its time on a transition
between an $a$ and a $b$, but much less (1/16) on transitions between pairs of letters $bc$, $bd$, $dc$, $dd$.
The letter frequencies in a random word of $\mathcal{L}$ are (1/2, 1/4, 1/8, 1/8), so that an $a$ is four times
more frequent than a $c$ or a $d$, and so on. See Figure V.17 (right) for a rendering.
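The numbers of this example can be recomputed mechanically. The sketch below is a minimal check, with hypothetical helper names: it rebuilds $T$ from the forbidden pairs, verifies the stated eigenvectors, and applies formula (92) with exact rationals.

```python
from fractions import Fraction

LETTERS = "abcd"
FORBIDDEN = {"ac", "ad", "bb", "cb", "cc", "cd", "da", "db"}

# T[s][t] = 1 when the transition from letter s to letter t is allowed.
T = [[0 if x + y in FORBIDDEN else 1 for y in LETTERS] for x in LETTERS]
lam1 = 2
r = [2, 2, 1, 1]   # right eigenvector stated in the example
l = [2, 1, 1, 1]   # left eigenvector stated in the example

def is_right_eigenvector(M, v, lam):
    n = len(M)
    return all(sum(M[i][j] * v[j] for j in range(n)) == lam * v[i] for i in range(n))

def is_left_eigenvector(M, v, lam):
    n = len(M)
    return all(sum(v[i] * M[i][j] for i in range(n)) == lam * v[j] for j in range(n))

def edge_frequencies(M, l, r, lam1):
    """tau[s][t] = l_s M[s][t] r_t / (lam1 <l, r>), formula (92)."""
    norm = lam1 * sum(a * b for a, b in zip(l, r))
    n = len(M)
    return [[Fraction(l[s] * M[s][t] * r[t], norm) for t in range(n)] for s in range(n)]

TAU = edge_frequencies(T, l, r, lam1)
LETTER_FREQ = [sum(row) for row in TAU]   # frequency of each letter
```

Summing a row of `TAU` gives the frequency of the corresponding letter, which is how the vector (1/2, 1/4, 1/8, 1/8) arises.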
Various specializations, including multivariate GFs and non-uniform letter models, are
readily treated by this method. Bertoni et al. [59] develop related variance and distribution
calculations for the number of occurrences of a symbol in an arbitrary regular language. ∎


Example V.12. Walks on the interval. As a direct illustration, consider the walks associated
to the graph $\Gamma(5)$ with vertex set $1, \ldots, 5$ and edges being formed of all pairs $(i, j)$ such that
$|i - j| \le 1$. The graph $\Gamma(5)$ and its incidence matrix $G(5)$ are

        [Graph $\Gamma(5)$: vertices 1, 2, 3, 4, 5 on a line, each joined to its neighbours and to itself.]

        $G(5) = \begin{pmatrix} 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}$.

The characteristic polynomial $\chi_{G(5)}(z) := \det(zI - G(5))$ factorizes as

        $\chi_{G(5)}(z) = z(z - 1)(z - 2)(z^2 - 2z - 2)$,

and its dominant root is $\lambda_1 = 1 + \sqrt{3}$. From here, one finds a left eigenvector (which is also a
right eigenvector since the matrix is symmetric):

        $r = \ell^t = (1, \sqrt{3}, 2, \sqrt{3}, 1)$.

Thus a random path (with the uniform distribution over all paths corresponding to the weights
being equal to 1) visits nodes $1, \ldots, 5$ with frequencies proportional to

        $\frac{1}{12}, \ \frac{1}{4}, \ \frac{1}{3}, \ \frac{1}{4}, \ \frac{1}{12}$,

implying that the non-extremal nodes are visited more often: such nodes have higher degrees
of freedom, so that there tend to be more paths that traverse them.
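These values are easy to confirm numerically. Summing (92) over $t$ and using $\sum_t g_{s,t} r_t = \lambda_1 r_s$ shows that node $s$ is visited with frequency $\ell_s r_s / \langle \ell, r \rangle$. The sketch below, with hypothetical function names and floating-point arithmetic, checks the eigenpair and evaluates these frequencies.

```python
import math

def interval_matrix(k):
    """Incidence matrix of the graph on {1..k} with edges |i - j| <= 1."""
    return [[1 if abs(i - j) <= 1 else 0 for j in range(k)] for i in range(k)]

def residual(G, v, lam):
    """Largest |(G v - lam v)_i|; zero up to rounding for an eigenpair."""
    n = len(v)
    return max(abs(sum(G[i][j] * v[j] for j in range(n)) - lam * v[i])
               for i in range(n))

def visit_frequencies(l, r):
    """Node frequencies l_s r_s / <l, r>."""
    norm = sum(a * b for a, b in zip(l, r))
    return [a * b / norm for a, b in zip(l, r)]

G5 = interval_matrix(5)
lam1 = 1 + math.sqrt(3)
r = [1, math.sqrt(3), 2, math.sqrt(3), 1]
FREQ = visit_frequencies(r, r)   # the matrix is symmetric, so l = r
```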

In fact, this example has structure. For instance, the graph $\Gamma(11)$ defined by an interval of
length 10 leads to a matrix with a highly factorable characteristic polynomial

        $\chi_{G(11)}(z) = z(z - 1)(z - 2)(z^2 - 2z - 2)(z^2 - 2z - 1)(z^4 - 4z^3 + 2z^2 + 4z - 2)$.

The reader may have recognized here a particular case of lattice paths, which is covered by the
theory presented in Section V.4, p. 318. Indeed, according to Proposition V.3, the OGF of paths
from vertex 1 to vertex 1 in the graph $\Gamma(k)$ with vertex set $\{1, \ldots, k\}$ is given by the continued
fraction

        $\cfrac{1}{1 - z - \cfrac{z^2}{1 - z - \cfrac{z^2}{\ddots \; - \cfrac{z^2}{1 - z}}}}$.

(The number of fraction bars is $k$.) From this it can be shown that the characteristic polynomial
of $G$ is an elementary variant of the Fibonacci–Chebyshev polynomial of Example V.8, p. 326.
The analysis based on Theorem V.8 is simpler, albeit more rudimentary, as it only provides a
first-order asymptotic solution to the problem.

This example is typical: whenever combinatorial problems have the appropriate amount of
regularity, all the resources of linear algebra are available, including the vast body of knowledge
gathered over years on calculations of structured determinants, which is well summarized in
Krattenthaler's survey [391] and the book by Vein and Dale [594]. ∎
Figure V.18. The devil's staircase (m = 6) and the two matrices that can model it.


Example V.13. Elementary theory of finite Markov chains. Consider the case where the row
sums of matrix $G$ are all equal to 1, that is, $\sum_j g_{i,j} = 1$. Such a matrix is called a stochastic
matrix. The quantity $g_{i,j}$ can then be interpreted as the probability of leaving state $i$ for state $j$,
assuming one is in state $i$. Assume that the matrix $G$ is irreducible and aperiodic. Clearly, the
matrix $G$ admits the column vector $r = (1, 1, \ldots, 1)^t$ as a right eigenvector corresponding to
the dominant eigenvalue $\lambda_1 = 1$. The left eigenvector $\ell$ normalized so that its elements sum
to 1 is called the (row) vector of stationary probabilities. It must be calculated by linear algebra
and its determination involves finding an element of the kernel of matrix $I - G$, which can be
done in a standard way.
Theorem V.8 and Equation (93) immediately imply the following:

Proposition V.8 (Stationary probabilities of Markov chains). Consider a weighted graph
corresponding to a stochastic matrix $G$ which is irreducible and aperiodic. Let $\ell$ be the normalized
left eigenvector corresponding to the eigenvalue 1. A random (weighted) path of length $n$ with
fixed origin and destination visits node $s$ a mean number of times asymptotic to $\ell_s n$ and
traverses edge $(s, t)$ a mean number of times asymptotic to $\ell_s g_{s,t} n$. A random path of length $n$
with fixed origin ends at vertex $s$ with probability asymptotic to $\ell_s$.

The vector $\ell$ is also known as the vector of stationary probabilities. The first-order asymptotic
property expressed by Proposition V.8 certainly constitutes the most fundamental result in
the theory of finite Markov chains. ∎


Example V.14. The devil’s staircase. This example illustrates an elementary technique often 
employed in calculations of eigenvalues and eigenvectors. It presupposes that the matrix to be 
analysed can be reduced to a sparse form and has a sufficiently regular structure. 

You live in a house that has a staircase with m steps. You come back home a bit loaded 
and at each second, you can either succeed in climbing a step or fall back all the way down. On 
the last step, you always stumble and fall back down (Figure V.18). Where are you likely to be 
found at time n? 

Precisely, two slightly different models correspond to this informally stated problem. The 
probabilistic model views it as a Markov chain with equally likely possibilities at each step and 
is reflected by matrix G in Figure V.18. The combinatorial model just assumes all possible 
evolutions (“histories”) of the system as equally likely and it corresponds to matrix G. We opt 
here for the latter, keeping in mind that the same method basically applies to both cases. 

We first write down the constraints expressing the joint properties of an eigenvalue $\lambda$ and
its right eigenvector $x = (x_1, \ldots, x_m)^t$. The equations corresponding to $(\lambda I - G)x = 0$ are
formed of a first batch of $m - 1$ relations,

(101)   $(\lambda - 1)x_1 - x_2 = 0$,  $-x_1 + \lambda x_2 - x_3 = 0$,  $\cdots$,  $-x_1 + \lambda x_{m-1} - x_m = 0$,

together with the additional relation (one cannot go higher than the last step):

(102)   $-x_1 + \lambda x_m = 0$.

The solution to (101) is readily found by pulling out successively $x_2, \ldots, x_m$ as functions of $x_1$:

(103)   $x_2 = (\lambda - 1)x_1$,  $x_3 = (\lambda^2 - \lambda - 1)x_1$,  $\cdots$,  $x_m = (\lambda^{m-1} - \lambda^{m-2} - \cdots - 1)x_1$.

Combined with the special relation (102), this last relation shows that $\lambda$ must satisfy the equation

(104)   $1 - 2\lambda^m + \lambda^{m+1} = 0$.

Let $\lambda_1$ be the largest positive root of this equation, existence and dominance being guaranteed
by Perron–Frobenius properties. Note that the quantity $\rho := 1/\lambda_1$ satisfies the characteristic
equation

        $1 - 2\rho + \rho^{m+1} = 0$,

already encountered when discussing longest runs in words; the discussion of Example V.4 then
grants us the existence of an isolated $\rho$ near $\frac12$, hence the fact that $\lambda_1$ is slightly less than 2.

Similar devices yield the left eigenvector $y = (y_1, \ldots, y_m)$. It is found easily that $y_j$ must
be proportional to $\lambda_1^{-j}$. We thus obtain from Theorem V.8 and Equation (97): The probability
of being in state $j$ (i.e., being on step $j$ of the stair) at time $n$ tends to the limit

        $\varpi_j = \gamma \lambda_1^{-j}$,

where $\lambda_1$ is the root near 2 of the polynomial (104) and the normalization constant $\gamma$ is
determined by $\sum_j \varpi_j = 1$. In other words, the distribution of the altitude at time $n$ is a truncated
geometric distribution with parameter $1/\lambda_1$. For instance, $m = 6$ leads to $\lambda_1 \doteq 1.98358$, and
the asymptotic probabilities of being in states $1, \ldots, 6$ are

(105)   0.50413, 0.25415, 0.12812, 0.06459, 0.03256, 0.01641,

exhibiting a clear geometric decay. Here is the simulation of a random trajectory for $n = 100$:

        [Plot: altitude as a function of time for the simulated trajectory.]

In this case, the frequencies observed are 0.44, 0.26, 0.17, 0.08, 0.04, 0.01, pretty much in
agreement with what is expected.
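The values quoted for $m = 6$ can be reproduced with a few lines. The sketch below is one possible way to do it, with hypothetical function names; bisection on the interval $(1.5, 2)$ is a choice made here, valid for $m \ge 2$ since the polynomial of (104) is negative at 1.5 and equals 1 at 2.

```python
def dominant_root(m, tol=1e-12):
    """Root of 1 - 2 x^m + x^(m+1) lying in (1.5, 2), found by bisection.
    Assumes m >= 2, so that the polynomial changes sign on this interval."""
    def f(x):
        return 1 - 2 * x ** m + x ** (m + 1)
    lo, hi = 1.5, 2.0          # f(lo) < 0 < f(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def altitude_distribution(m):
    """Limit probabilities of being on steps 1..m: proportional to lambda_1^(-j)."""
    lam = dominant_root(m)
    weights = [lam ** (-j) for j in range(1, m + 1)]
    total = sum(weights)
    return [w / total for w in weights]

LAMBDA_6 = dominant_root(6)
PROBS_6 = altitude_distribution(6)
```

The list `PROBS_6` agrees with (105) to within the last displayed digit.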

Finally, the similarity with the longest run problem in words is easily explained. Let $u$
and $d$ be letters representing steps upwards and downwards, respectively. The set of paths from
state 1 to state 1 is described by the regular expression

        $\mathcal{P}_{1,1} = \left( d + ud + \cdots + u^{m-1} d \right)^*$,

corresponding to the generating function

        $P_{1,1}(z) = \dfrac{1}{1 - z - z^2 - \cdots - z^m}$,

a variant of the OGF of words without $m$-runs of the letter $u$, which also corresponds to the
enumeration of compositions with summands $\le m$. (The case of the probabilistic transition
matrix G is left as an exercise to the reader.) ∎
Example V.15. De Bruijn graphs. Two thieves want to break into a house whose entrance
is protected by a digital lock with an unknown four-digit code. As soon as the four digits of
the code are typed consecutively, the gate opens. The first thief proposes to try in order all the
four-digit sequences, resulting in as many as 40 000 key strokes in the worst case. The second
thief, who is a mathematician, says he can try all four-digit combinations with only 10003 key
strokes. What is the mathematician's trade secret?

Clearly certain optimizations are possible: for instance, for an alphabet of cardinality 2
and codes of two letters, the sequence 00110 is better than the naive one, 00 01 10 11, which
is redundant; a few more attempts will lead to an optimal solution for three-digit codes that has
length 10 (rather than 24), for instance,

        0001110100.

The general question is then: How far can one go and how to construct such sequences?

Fix an alphabet of cardinality $m$. A sequence that contains as factors (contiguous blocks)
all the $k$-letter words is called a de Bruijn sequence. Clearly, its length must be at least $\delta(m, k) =
m^k + k - 1$, as it must have at least $m^k$ positions at distance at least $k - 1$ from the end. A
sequence of smallest possible length $\delta(m, k)$ is called a minimal de Bruijn sequence. Such
sequences were discovered by N. G. de Bruijn [140] in 1946, in response to a question coming
from electrical engineering, where all possible reactions of a device presented as a black box
must be tested at minimal cost. We shall treat here the case of a binary alphabet, $m = 2$, the
generalization to $m > 2$ being obvious.

Let $\ell = k - 1$ and consider the automaton $\mathcal{B}_\ell$ that memorizes the last block of length $\ell$ read
when scanning the input text from left to right. A state is thus assimilated to a string of length $\ell$
and the total number of states is $2^\ell$. The transitions are easily calculated: let $q \in \{0, 1\}^\ell$ be
a state and let $\sigma(w)$ be the function that shifts all letters of a word $w$ one position to the left,
dropping the first letter of $w$ in the process (thus $\sigma$ maps $\{0, 1\}^\ell$ to $\{0, 1\}^{\ell-1}$); the transitions
are

        $q \xrightarrow{\;0\;} \sigma(q)0$,    $q \xrightarrow{\;1\;} \sigma(q)1$.

If one further interprets a state $q$ as the integer in the interval $[0 \,..\, 2^\ell - 1]$ that it represents, then
the transition matrix assumes a remarkably simple form:

        $T_{i,j} = [\![ (j \equiv 2i \bmod 2^\ell) \ \text{or} \ (j \equiv 2i + 1 \bmod 2^\ell) ]\!]$.

See Figure V.19 for a rendering borrowed from [263].

Combinatorially, the de Bruijn graph is such that each node has indegree 2 and outdegree 2.
By a well known theorem going back to Euler: A necessary and sufficient condition for an
undirected connected graph to have an Eulerian circuit (that is, a closed path that traverses
each edge exactly once) is that every node has even degree. For a strongly connected digraph,
the condition is that each node has an outdegree equal to its indegree. This last condition is
obviously satisfied here. Take an Eulerian circuit starting and ending at node $0^\ell$; its length is
$2^{\ell+1} = 2^k$. Then, clearly, the sequence of edge labels encountered, when prefixed with the word
$0^{k-1} = 0^\ell$, constitutes a minimal de Bruijn sequence. In general, the argument gives a de Bruijn
sequence with minimal length $m^k + k - 1$. Et voilà! The trade secret of the thief-mathematician
is exposed.
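The construction just described can be sketched in code for the binary alphabet. The version below uses Hierholzer's algorithm to find the Eulerian circuit, which is a choice made here for illustration (the text does not prescribe a method); states are the integers $0, \ldots, 2^\ell - 1$ and the function name is hypothetical.

```python
def de_bruijn_sequence(k):
    """Binary sequence of length 2**k + k - 1 containing every k-letter
    word as a factor. Requires k >= 2 (so that there are at least 2 states)."""
    ell = k - 1
    size = 1 << ell                        # number of states, 2**ell
    # unused[q]: letters whose outgoing edge from state q is not yet traversed
    unused = [[1, 0] for _ in range(size)]
    stack, circuit = [0], []
    while stack:                           # Hierholzer's algorithm, iterative form
        q = stack[-1]
        if unused[q]:
            letter = unused[q].pop()
            stack.append((2 * q + letter) % size)   # transition q -> sigma(q).letter
        else:
            circuit.append(stack.pop())
    circuit.reverse()                      # states along an Eulerian circuit from 0 to 0
    # The edge entering state q' carries the last letter of q', that is q' mod 2.
    labels = [q % 2 for q in circuit[1:]]
    return "0" * ell + "".join(map(str, labels))
```

With this particular edge ordering, `de_bruijn_sequence(2)` produces the sequence 00110 quoted earlier in the example; other orderings give other minimal sequences.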


Back to enumeration. The de Bruijn matrix is irreducible since a path labelled by sufficiently
many zeros always leads any state to the state $0^\ell$, while a path ending with the letters
of $w \in \{0, 1\}^\ell$ leads to state $w$. The matrix is aperiodic since it has a loop on states $0^\ell$ and $1^\ell$.
Thus, by Perron–Frobenius properties, it has a unique dominant eigenvalue, and it is not hard to
check that its value is $\lambda_1 = 2$, corresponding to the right eigenvector $(1, 1, \ldots, 1)^t$. If one fixes
a pattern $w \in \{0, 1\}^\ell$, Theorem V.8 yields back the known fact that a random word contains
on average $\sim n / 2^\ell$ occurrences of pattern $w$, while Note V.39, p. 349, further implies that the
distribution of the number of occurrences is concentrated around the mean, as the variance is
$O(n)$. The de Bruijn graph may be used to quantify many properties of occurrences of patterns
in random words: see for instance [43, 240, 263]. ∎

Figure V.19. The de Bruijn graph: (left) $\ell = 3$; (right) $\ell = 7$.


Example V.16. Words with excluded patterns. Fix a finite set of patterns $\Omega = \{w_1, \ldots, w_r\}$,
where each $w_j$ is a word of $\mathcal{A}^*$. The language $\mathcal{E} \equiv \mathcal{E}^\Omega$ of words that contain no factor in $\Omega$ is
described by the extended regular expression

        $\mathcal{E} = \mathcal{A}^* \setminus \bigcup_{j=1}^{r} \left( \mathcal{A}^* w_j \mathcal{A}^* \right)$,

which constitutes a concise but highly ambiguous description. By closure properties of regular
languages, $\mathcal{E}$ is itself regular and there must exist a deterministic automaton that recognizes it.

An automaton recognizing $\mathcal{E}$ can be constructed starting from the de Bruijn automaton of
index $k = -1 + \max |w_j|$ and deleting all the vertices and edges that correspond to a word of $\Omega$.
Precisely, vertex $q$ is deleted whenever $q$ contains a factor in $\Omega$; the transition (edge) from $q$
associated with letter $\alpha$ gets deleted whenever the word $q\alpha$ contains a factor in $\Omega$. The pruned
de Bruijn automaton, call it $\mathcal{B}_k^\Omega$, accepts all words of $0^k \mathcal{E}$, when it is equipped with the initial
state $0^k$ and all states are final. Thus, the OGF $E(z)$ is in all cases a rational function.

The matrix of $\mathcal{B}_k^\Omega$ is the matrix of the de Bruijn graph $\mathcal{B}_k$ with some non-zero entries
replaced by 0. Assume that $\mathcal{B}_k^\Omega$ is irreducible. This assumption only eliminates a few pathological
cases (e.g., $\Omega = \{01\}$ on the alphabet $\{0, 1\}$). Then, the matrix of $\mathcal{B}_k^\Omega$ admits a simple
Perron–Frobenius eigenvalue $\lambda_1$. By domination properties ($\Omega \ne \emptyset$), we must have $\lambda_1 < m$, where $m$
is the cardinality of the alphabet. Aperiodicity is automatically granted. We then get by a purely
qualitative argument: The number of words of length $n$ excluding patterns from the finite set $\Omega$
is, under the assumption of irreducibility, asymptotic to $c\lambda_1^n$ (that is, a proportion $\sim c(\lambda_1/m)^n$
of all words of length $n$), for some $c > 0$ and $\lambda_1 < m$.
This gives us in a simple manner a strong version of what has been earlier nicknamed "Borges's
Theorem" (Note I.35, p. 61): Almost every sufficiently long text contains all patterns of some
predetermined length $\ell$.

The construction of a pruned automaton is clearly a generalization of the case of words
obeying local constraints in Example V.11 above. ∎
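A pruned automaton of this kind is simple to simulate. The sketch below is a variant made for illustration, not the exact construction of the text: instead of starting from the state $0^k$, it starts from all allowed blocks of length $k = \max|w_j| - 1$ at once and counts short words directly. The function name and the sample pattern sets are assumptions.

```python
from itertools import product

def count_avoiding(patterns, alphabet, n_max):
    """Return the numbers of words of length 0..n_max over `alphabet`
    that contain no factor in `patterns`."""
    k = max(len(w) for w in patterns) - 1

    def ok(word):
        return not any(p in word for p in patterns)

    counts = []
    # Words shorter than k are enumerated directly.
    for n in range(min(k, n_max + 1)):
        counts.append(sum(ok("".join(w)) for w in product(alphabet, repeat=n)))
    # States: allowed blocks of length k (the last k letters read so far).
    state = {"".join(w): 1 for w in product(alphabet, repeat=k) if ok("".join(w))}
    for n in range(k, n_max + 1):
        counts.append(sum(state.values()))
        nxt = {}
        for q, c in state.items():
            for a in alphabet:
                if ok(q + a):                  # this edge survives the pruning
                    t = (q + a)[1:]            # shift: drop the oldest letter
                    nxt[t] = nxt.get(t, 0) + c
        state = nxt
    return counts
```

For the single pattern 11 over the binary alphabet, the counts are the Fibonacci numbers 1, 2, 3, 5, 8, ..., so that $\lambda_1$ is the golden ratio, which is indeed less than $m = 2$.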
Transfer matrix method. Let $\mathcal{C}$ be a combinatorial class to be enumerated.

(i) Determine a collection $\mathcal{C}_1, \mathcal{C}_2, \ldots, \mathcal{C}_m$ of classes, with $\mathcal{C}_1 = \mathcal{C}$, such that the following
system of equations holds:

(106)   $\mathcal{C}_j = \sum_{k \in \{1, 2, \ldots, m\}} \Omega_{j,k}\, \mathcal{C}_k + \mathcal{I}_j$,    $j = 1, 2, \ldots, m$,

where each $\Omega_{j,k}$ and each $\mathcal{I}_j$ is a finite class.

(ii) The OGF $C(z) \equiv C_1(z)$ is then given by the solution of the linear system

(107)   $C_j(z) = \sum_k \Omega_{j,k}(z)\, C_k(z) + I_j(z)$,    $j = 1, \ldots, m$,

where $\Omega_{j,k}(z)$ and $I_j(z)$ are the generating polynomials of $\Omega_{j,k}$ and $\mathcal{I}_j$, respectively.
Accordingly, $C(z)$ is a $\mathbb{C}[z]$-linear combination of entries of the quasi-inverse matrix $(I - \Omega(z))^{-1}$.

Figure V.20. A summary of the basic transfer matrix method.


▷ V.40. Walks on undirected graphs. Consider an undirected graph $\Gamma$, where one moves by
following at each step a random edge of the graph, uniformly at random from the current position.
Then, the transition matrix $P = (p_{i,j})$ of the associated Markov chain is: $p_{i,j} = 1/\deg(i)$
if $(i, j)$ is an edge, where $\deg(i)$ is the degree of vertex $i$. The stationary distribution is given
by $\pi_i = \deg(i) / (2\|E\|)$, where $\|E\|$ is the number of edges of $\Gamma$. In particular, if the graph
is regular, the stationary distribution is uniform. (See Aldous and Fill's forthcoming book [11]
for (much) more.) ◁

▷ V.41. Words with excluded patterns and digital trees. Let $S$ be a finite set of words. An
automaton recognizing $S$, considered as a finite language, can be constructed as a tree. The tree
obtained is akin to the classical digital tree or trie that serves as a data structure for maintaining
dictionaries [378]. A modification of the construction yields an automaton of size linear in the
total number of characters that appear in words of $S$. [Hint. The construction can be based on
the Aho–Corasick automaton [5, 538].] ◁
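The claim of Note V.40 amounts to the identity $\sum_i \pi_i p_{i,j} = \sum_{i \sim j} \frac{\deg(i)}{2\|E\|} \cdot \frac{1}{\deg(i)} = \frac{\deg(j)}{2\|E\|}$. A minimal check on a small graph follows; the edge list is an arbitrary assumption, and the sketch supposes a simple graph with no loops and no isolated vertices.

```python
from fractions import Fraction

def walk_matrix(n_vertices, edges):
    """Transition matrix of the uniform random walk on a simple undirected
    graph, together with the list of vertex degrees."""
    adj = [[0] * n_vertices for _ in range(n_vertices)]
    for i, j in edges:
        adj[i][j] = adj[j][i] = 1
    deg = [sum(row) for row in adj]
    P = [[Fraction(adj[i][j], deg[i]) for j in range(n_vertices)]
         for i in range(n_vertices)]
    return P, deg

def is_stationary(P, pi):
    """True when pi P = pi, entry by entry."""
    n = len(P)
    return all(sum(pi[i] * P[i][j] for i in range(n)) == pi[j] for j in range(n))

# Illustrative graph (an assumption): a triangle 0-1-2 with a pendant vertex 3.
EDGES = [(0, 1), (1, 2), (2, 0), (2, 3)]
P, DEG = walk_matrix(4, EDGES)
PI = [Fraction(d, 2 * len(EDGES)) for d in DEG]
```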


V.6. Transfer matrix models 


There exists a cluster of applications of rational functions to problems that are nat- 
urally described as paths in digraphs, but with edges that may be of different sizes. In 
physics, such models lie at the heart of what is known as the “transfer matrix method”. 
Technically, the theory is a simple extension of the standard case of paths in graphs
developed in Section V.5. Its main interest lies in its expressiveness as regards a number
of combinatorial problems, including trees of bounded width, partial models of
self-avoiding walks, and certain constrained permutation problems.


V.6.1. Combinatorial aspects. The transfer matrix method constitutes a variant
of the modelling by deterministic automata and by paths in standard graphs. The
general framework is summarized in Figure V.20. The idea is to set up a system
of linear equations that relate a cleverly crafted collection of classes ("states") $\mathcal{C}_j$,
which are of the same nature as the original class $\mathcal{C}$ that is to be enumerated. The
combinatorial system (106) in Figure V.20 can then be visualized as a graph, with
the objects of the $\Omega_{j,k}$ classes attached to edges ("transitions between states") being
generally of different sizes.
Definition V.7. Given a directed multigraph $\Gamma$ with vertex set $V$ and edge set $E$, a
size function on $\Gamma$ is any function $\sigma : E \to \mathbb{Z}_{\ge 1}$. A sized graph is a pair $(\Gamma, \sigma)$,
where $\sigma$ is a size function.


Paths are defined in the same way as in Section V.5. The length of a path is, as 
usual, the number of edges it comprises; the size of a path is defined to be the sum of 
the sizes of its edges. As in the basic case treated in the previous section, we also allow 
edges to carry positive weights (multiplicities, probability coefficients), the weight of 
a path being the product of the weights of its edges. 

Definition V.8. A matrix $T(z)$ is a transfer matrix if each of its entries is a polynomial
in $z$ with non-negative coefficients. A transfer matrix $T(z)$ is said to be proper if $T(0)$
is nilpotent, that is, $T(0)^r = 0$ for some $r \ge 1$.


Examples of transfer matrices are

        [Display: two example transfer matrices.]



and both are proper. For the graphs and automata considered in Section V.5, all edges
were taken to be of unit size. In that case, the associated (weighted) adjacency matrices
are invariably of the form $T(z) = zS$, with $S$ a scalar matrix having non-negative
entries, and thus are very particular cases of proper transfer matrices.

Given a sized graph $\Gamma$ equipped with weight function $w : E \to \mathbb{R}_{\ge 0}$ (with
$w(e) \equiv 1$ in the pure enumerative case), we can associate to it a transfer matrix $T(z)$
as follows:

(108)   $T_{a,b}(z) = \sum_{e \in \mathrm{Edge}(a,b)} w(e)\, z^{|e|}$.

There, $\mathrm{Edge}(a, b)$ represents the set of all edges connecting $a$ to $b$; $w(e)$ and $|e| =
\sigma(e)$ represent, respectively, the weight and the size of edge $e$. The matrix $T(z)$ whose
$(a, b)$-entry is the polynomial $T_{a,b}(z)$, as given in (108), is called the transfer matrix
of the (weighted, sized) graph. By Definition V.7, the transfer matrix of a sized graph
is always proper. Since $T(z)^n$ describes all paths of length $n$ in the graph with $z$ marking size,
the proof techniques of Proposition V.6 (p. 337) immediately provide:


Proposition V.9. Given a sized graph with associated transfer matrix $T(z)$, the OGF
$F^{\langle i,j \rangle}(z)$ of the set of paths from $i$ to $j$, where $z$ marks size, is the entry $i, j$ of the
matrix $(I - T(z))^{-1}$:

        $F^{\langle i,j \rangle}(z) = \left( (I - T(z))^{-1} \right)_{i,j}$.
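The coefficients of the quasi-inverse can be computed without any symbolic algebra: decomposing a non-empty path according to its first edge gives $F_n = [\![n = 0]\!]\, I + \sum_{s \ge 1} T_s F_{n-s}$, where $T_s$ is the matrix of coefficients of $z^s$ in $T(z)$. The sketch below implements this recurrence; the encoding of polynomials as dictionaries and the sample graph are assumptions made for the illustration.

```python
def path_counts_by_size(T, n_max):
    """coeffs[n][i][j] = [z^n] ((I - T(z))^{-1})_{i,j}.

    T[i][j] is a dict {size: weight} encoding the polynomial T_{i,j}(z);
    every size must be >= 1, so that T(z) is proper."""
    m = len(T)
    coeffs = []
    for n in range(n_max + 1):
        # The empty path contributes the identity matrix at n = 0.
        F = [[1 if (n == 0 and i == j) else 0 for j in range(m)] for i in range(m)]
        # First an edge i -> k of size s, then a path k -> j of size n - s.
        for i in range(m):
            for k in range(m):
                for s, w in T[i][k].items():
                    if 1 <= s <= n:
                        for j in range(m):
                            F[i][j] += w * coeffs[n - s][k][j]
        coeffs.append(F)
    return coeffs

# Illustrative sized graph: one vertex with two loops, of sizes 1 and 2.
T_LOOPS = [[{1: 1, 2: 1}]]
LOOP_COUNTS = [F[0][0] for F in path_counts_by_size(T_LOOPS, 10)]
```

For this one-vertex graph the OGF is $1/(1 - z - z^2)$, so `LOOP_COUNTS` lists Fibonacci numbers, that is, the numbers of compositions with summands in $\{1, 2\}$.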

V.6.2. Analytic aspects. In order to apply the general results from Section V.5
to transfer matrices, we must first take note of an easy reduction of transfer matrices
to the standard case of paths in graphs where all edges have size 1.

Given a sized graph $\Gamma$, one can build as follows a standard graph $G$ where all
edges of $G$ have unit size. The set of vertices of $G$ is the set of vertices of $\Gamma$ augmented
by additional vertices called relay nodes. For each edge $e$ of size $\sigma(e) = m$ in $\Gamma$, going
from $a$ to $b$, introduce $m - 1$ additional relay nodes and connect these in $G$ by a simple path from
$a$ to $b$, with edges all of size 1. Here is for instance the transcription of an edge of
size 4 in $\Gamma$ by means of three relay nodes in $G$:

        $a \xrightarrow{\;4\;} b \qquad \leadsto \qquad a \to \circ \to \circ \to \circ \to b$.

Clearly, the vertices of $\Gamma$ are a subset of the vertices of $G$ and all paths of $\Gamma$ correspond
to paths of $G$. Let $T$ be the (scalar) adjacency matrix of $G$. Then, the quasi-inverse
$(I - zT)^{-1}$ describes all the paths in $\Gamma$, with size taken into account, in the sense that
the entry of index $(i, j)$ in this quasi-inverse is the OGF of paths from node numbered $i$
to node numbered $j$ in the sized graph $\Gamma$.

This construction permits us to apply the main results of Section V.5 to transfer
matrices and sized graphs. Let us say that the sized graph $\Gamma$ and its transfer matrix
$T(z)$ are irreducible (respectively, aperiodic) if $G$ and $T$ are irreducible (respectively,
aperiodic). We can then immediately transcribe Theorems V.7 and V.8 as follows.


Corollary V.1. (i) Consider a sized graph $\Gamma$ that is irreducible and aperiodic. Then,
there exist a computable constant $\lambda_1$ and numbers $\varphi_{i,j}$ such that the OGF of paths
from $i$ to $j$ in $\Gamma$ satisfies

(109)   $[z^n] F^{\langle i,j \rangle}(z) = \varphi_{i,j} \lambda_1^n + O(\Lambda^n)$,    $0 \le \Lambda < \lambda_1$.

(ii) In a random path from $a$ to $b$ of large size $n$, the number of occurrences of a
designated edge $(s, t)$ is asymptotically

(110)   $\varpi_{s,t}\, n + O(1)$,

for a computable constant $\varpi_{s,t}$.


Thus, on general grounds, the behaviour of paths is predictable. The notes be- 
low explore some further properties that make it possible to operate directly with the 
transfer matrix and the sized graph, without necessitating the explicit construction of 
T and G. 
▷ V.42. Irreducibility for sized graphs. The sized graph $\Gamma$ is irreducible if and only if the graph
obtained from $\Gamma$ by taking all edges to be of size 1 is strongly connected. The transfer matrix
$T(z)$ of $\Gamma$ is irreducible (in the sense above) if and only if $T(1)$ is irreducible in the usual sense
of scalar transfer matrices. ◁

▷ V.43. Aperiodicity for sized graphs. A polynomial $p(z) = \sum_j c_j z^{e_j}$, with every $c_j \ne 0$,
is said to be primitive if the quantity $\delta = \gcd(\{e_j\})$ is equal to 1; it is imprimitive otherwise.
Equivalently, $p(z)$ is imprimitive iff $p(z) = q(z^\delta)$ for some bona fide polynomial $q$ and some
$\delta > 1$. An irreducible sized graph is aperiodic (in the sense above) if and only if at least one
diagonal entry of some power $T(z)^e$ is a primitive polynomial. Equivalently: there exist two
circuits of the same length, whose sizes, $s_1, s_2$, satisfy $\gcd(s_1, s_2) = 1$. ◁

▷ V.44. Direct determination of the asymptotic growth constant. Let $\Gamma$ be a sized graph assumed
to be irreducible and aperiodic. Then, one has $\lambda_1 = 1/\rho$, where $\rho$ is the smallest
positive root of $\det(I - T(z)) = 0$, with $T(z)$ the transfer matrix of $\Gamma$. ◁


V.6.3. Applications. The quantitative properties summarized by (109) and (110)
apply with full strength to classes that are amenable to the transfer matrix method. We
shall first illustrate the situation by the width of trees following an early article by
Odlyzko and Wilf [463], then continue with an example that draws its inspiration
from the insightful exposition of domino tilings and generating functions in the book
of Graham, Knuth, and Patashnik [307], and conclude with an exactly solvable
polyomino model.

$$T(z) = \begin{pmatrix} z & z^2 & z^3 \\ 2z & 3z^2 & 4z^3 \\ 3z & 6z^2 & 10z^3 \end{pmatrix}$$

Figure V.21. The sized graph corresponding to general plane trees of width at most 3
and its transfer matrix. (For readability, the transitions from a node to itself are omitted.)


Example V.17. Width of trees. The width of a tree is defined as the maximal number of nodes 
that can appear on any layer at a fixed distance from the root. If a tree is drawn in the discrete 
plane, then width and height can be seen as the horizontal and vertical dimensions of a bounding 
rectangle. Also, width is an indicator of the complexity of traversing the tree in breadth-first 
search (by a queue), while height is associated to depth-first search (by a stack). 

Transfer matrices are ideally suited to the problem of analysing the number of trees of fixed
width. Consider a simple variety of trees $\mathcal{Y}$ corresponding to the equation $Y(z) = z\phi(Y(z))$,
where the “generator” $\phi$ describes the basic formation of trees (Proposition I.5, p. 66). Let
$\mathcal{C} := \mathcal{Y}^{[w]}$ be the subclass of trees of width at most $w$. Such trees are easily built layer by layer.
Indeed, with reference to our general description of the transfer matrix method at the beginning
of the section, let us introduce a collection of classes $\mathcal{C}_k$, where each $\mathcal{C}_k$ ($k = 1, \ldots, w$)
comprises all trees of width $\le w$ having exactly $k$ nodes at the deepest level. We then have
$\mathcal{C} = \sum_{k=1}^{w} \mathcal{C}_k$ (this is a trivial variant of the case considered in our general description). Thus
the states of the transfer matrix model, equivalently the nodes of the sized graph, correspond to
the number of nodes on the deepest layer of the tree. The transition between configurations $\mathcal{C}_j$
corresponding to state $j$ and configurations $\mathcal{C}_k$ corresponding to state $k$ is effected by grafting in
all possible ways a forest of $j$ trees, of total height equal to 1, having $k$ leaves. See Figure V.21
for the case of width $w = 3$.
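
As a concrete check of this layer-by-layer construction, here is a minimal Python sketch for general plane trees. It assumes that a node may have any number of children, so that $k$ children can be hung under $j$ nodes in $\binom{j+k-1}{k}$ ways (these are the coefficients 1, 1, 1; 2, 3, 4; 3, 6, 10 visible in Figure V.21). The function name `width_bounded_tree_counts` is ours and purely illustrative.

```python
from math import comb


def width_bounded_tree_counts(w: int, n_max: int) -> list[int]:
    """Return [Y_1, ..., Y_n_max]: numbers of plane trees with n nodes and width <= w."""
    if w < 1:
        raise ValueError("width bound must be at least 1")
    # count[n][k] = trees with n nodes, width <= w, and exactly k nodes on the deepest level
    count = [[0] * (w + 1) for _ in range(n_max + 1)]
    if n_max >= 1:
        count[1][1] = 1  # the tree reduced to its root
    for n in range(1, n_max + 1):
        for j in range(1, w + 1):
            if count[n][j] == 0:
                continue
            for k in range(1, w + 1):
                if n + k > n_max:
                    break
                # Assumption (plane trees): k children are distributed among the j
                # deepest nodes in C(j+k-1, k) ways; this appends one level of k nodes.
                count[n + k][k] += count[n][j] * comb(j + k - 1, k)
    return [sum(count[n]) for n in range(1, n_max + 1)]
```

When $w \ge n - 1$ the width constraint is vacuous and the counts reduce to the Catalan numbers: `width_bounded_tree_counts(10, 6)` gives 1, 1, 2, 5, 14, 42, whereas `width_bounded_tree_counts(2, 6)` gives 1, 1, 2, 4, 9, 19.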

The number of $j$-forests of depth 1 having $k$ leaves is the quantity

    $t_{j,k} = [u^k]\,\phi(u)^j.$

Let $T$ be the $w \times w$ matrix with entry $T_{j,k} = z^k t_{j,k}$. Then, clearly, the quantity $z^i (T^h)_{i,j}$ (with
$1 \le i, j \le w$) is the generating function of $i$-forests of height $h$ and width at most $w$, having $j$ nodes on
level $h$. Thus, the GF of $\mathcal{Y}$-trees having width at most $w$ is

    $Y^{[w]}(z) = (z, 0, 0, \ldots)\,(I - T)^{-1}\,(1, 1, 1, \ldots)^{t}.$

For instance, in the case of general Catalan trees, the matrix $T$ has the shape

$$T = \begin{pmatrix}
z\binom{1}{1} & z^2\binom{2}{2} & z^3\binom{3}{3} & z^4\binom{4}{4} \\
z\binom{2}{1} & z^2\binom{3}{2} & z^3\binom{4}{3} & z^4\binom{5}{4} \\
z\binom{3}{1} & z^2\binom{4}{2} & z^3\binom{5}{3} & z^4\binom{6}{4} \\
z\binom{4}{1} & z^2\binom{5}{2} & z^3\binom{6}{3} & z^4\binom{7}{4}
\end{pmatrix}$$

for width 4. The analysis of dominant poles provides asymptotic formulae for $[z^n]Y^{[w]}(z)$:

    w = 2:   0.0085 · 2.1701^n
    w = 3:   0.0026 · 2.8050^n
    w = 4:   0.0012 · 3.1638^n
    w = 5:   0.0006 · 3.3829^n
    w = 6:   0.0004 · 3.5259^n


Irreducibility is granted since all entries in the transfer matrix are non-zero. Aperiodicity derives
from aperiodicity of the generator $\phi$, as verified by a simple argument (e.g., using Note V.43).
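
The growth constants can be checked numerically through Note V.44, which gives $\lambda_1 = 1/\rho$ with $\rho$ the smallest positive root of $\det(I - T(z)) = 0$. The sketch below assumes general plane trees, that is $T_{j,k} = \binom{j+k-1}{k} z^k$, and locates the root by a coarse scan followed by bisection. The helper names and the scan step are illustrative choices, not part of the original treatment.

```python
from math import comb


def determinant(rows):
    """Determinant of a small square matrix by Gaussian elimination with partial pivoting."""
    a = [list(r) for r in rows]
    n, det = len(a), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return det


def char_det(w: int, z: float) -> float:
    """det(I - T(z)) for plane trees of width <= w (assumed T_{j,k} = C(j+k-1, k) z^k)."""
    return determinant(
        [[(1.0 if j == k else 0.0) - comb(j + k - 1, k) * z**k for k in range(1, w + 1)]
         for j in range(1, w + 1)]
    )


def growth_constant(w: int, step: float = 1e-3) -> float:
    """Return 1/rho, where rho is the smallest positive root of det(I - T(z))."""
    lo = 0.0  # det(I - T(0)) = 1 > 0
    while char_det(w, lo + step) > 0:
        lo += step
        if lo > 2.0:
            raise RuntimeError("no sign change found")
    hi = lo + step
    for _ in range(60):  # bisection keeps det(lo) > 0 >= det(hi)
        mid = (lo + hi) / 2
        if char_det(w, mid) > 0:
            lo = mid
        else:
            hi = mid
    return 2 / (lo + hi)
```

For $w = 2$ the determinant is $1 - z - 3z^2 + z^3$, and `growth_constant(2)` and `growth_constant(3)` agree with the values 2.1701 and 2.8050 quoted above to the displayed precision.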


Proposition V.10. The number of trees of width at most $w$ in a simple family of trees satisfies
an asymptotic estimate of the form

    $Y_n^{[w]} = c_w\,\rho_w^{-n} + O(\kappa_w^{n}),$

for some computable positive constants $c_w$, $\rho_w$, and some $\kappa_w < 1/\rho_w$.


In addition, the exact distribution of height in trees of size n becomes computable in poly- 
nomial time. 

The character of these generating functions has not been investigated in detail since the 
original work [463], so that, at the moment, complex analysis does not lead us any further. For- 
tunately, probability theory takes over. Chassaing and Marckert [111] have shown, for Cayley 
trees, that the width satisfies 


    $E_n(W) = \sqrt{\frac{\pi}{2}}\,\sqrt{n} + O\!\left(n^{1/4}\sqrt{\log n}\right), \qquad P_n\!\left(\sqrt{2}\,W \le x\sqrt{n}\right) \to 1 - \Theta(x),$

where $\Theta(x)$ is the Theta function defined in (67), p. 328. This answers very precisely an open
question of Odlyzko and Wilf [463]. The distributional results of [111] extend to trees in any
simple variety (under mild and natural analytic assumptions on the generator $\phi$): see the paper
by Chassaing, Marckert, and Yor [112], which builds upon earlier results of Drmota and
Gittenberger [173]. In essence, the conclusion of these works is that the breadth-first search traversal
of a large tree in a simple variety gives rise to a queue whose size fluctuates asymptotically
like a Brownian excursion, and is thus, in a strong sense, of a complexity comparable to
depth-first search: trees taken uniformly don’t have much of a preference as to the way they may be
traversed.


> V.45. A question on width polynomials. It is unknown whether the following assertion is
true. The smallest positive root $\rho_k$ of the denominator of $Y^{[k]}(z)$ satisfies

    $\rho_k = \rho + \frac{c}{k^2} + o(k^{-2}),$

for some $c > 0$. If such an estimate were established, together with suitable companion bounds,
it would yield a purely analytic proof of the fact that the expected width of $n$-trees is $O(\sqrt{n})$,
as well as detailed probability estimates. (The classical theory of Fredholm equations may be
useful in this context.) <


Example V.18. Monomer-dimer tilings of a rectangle. Suppose one is given pieces that may
be one of the three forms: monomers ($m$) that are $1 \times 1$ squares, and dimers that are dominoes,
either vertically ($v$) oriented $1 \times 2$, or horizontally ($h$) oriented $2 \times 1$. In how many ways can
an $n \times 3$ rectangle be covered completely and without overlap (‘tiled’) by such pieces?
The pieces are thus of the following types,

    [pictures of the three pieces: $m$, a unit square; $h$, a horizontal domino; $v$, a vertical domino]

and here is a particular tiling of a $5 \times 3$ rectangle:

    [picture of the tiling]


In order to approach this counting problem, one first defines a suitable collection, generi- 
cally denoted by C, of combinatorial classes called configurations, in accordance with the strat- 
egy summarized in Figure V.20, p. 356. A configuration relative to an n x k rectangle is a partial 
tiling, such that all the first n — 1 columns are entirely covered by dominoes while between zero 
and three unit cells of the last column are covered. Here are for instance, configurations corre- 
sponding to the example above. 


These diagrams suggest the way configurations can be built by successive addition of 
dominoes. Starting with the empty rectangle 0 x 3, one adds at each stage a collection of 
at most three dominoes in such a way that there is no overlap. This creates a configuration 
where, like in the example above, the dominoes may not be aligned in a flush-right manner. 
Continue to add successively dominoes whose left border is at abscissa 1, 2,3, etc, in a way 
that creates no internal “holes”. 

Depending on the state of filling of their last column, configurations can thus be classified
into 8 classes that we may index in binary as $\mathcal{C}_{000}, \ldots, \mathcal{C}_{111}$. For instance $\mathcal{C}_{001}$ represents
configurations such that the first two cells (from top to bottom, by convention) are free, while
the third one is occupied. Then, a set of rules describes the new type of configuration obtained,
when the sweep line is moved one position to the right and dominoes are added. For instance,
adding two horizontal dominoes, one on the top and one on the bottom, we have

    $\mathcal{C}_{010} \odot (h \cdot h) \Longrightarrow \mathcal{C}_{101}.$


In this way, one can set up a system of linear equations (resembling a grammar or a de- 
terministic finite automaton) that expresses all the possible constructions of longer rectangles 
from shorter ones according to the last layer added. The system contains equations like 


    $\mathcal{C}_{000} = \epsilon + mmm\,\mathcal{C}_{000} + mv\,\mathcal{C}_{000} + vm\,\mathcal{C}_{000}$
    $\qquad\quad + {\cdot}mm\,\mathcal{C}_{100} + m{\cdot}m\,\mathcal{C}_{010} + mm{\cdot}\,\mathcal{C}_{001} + v{\cdot}\,\mathcal{C}_{001} + {\cdot}v\,\mathcal{C}_{100}$
    $\qquad\quad + m{\cdot}{\cdot}\,\mathcal{C}_{011} + {\cdot}m{\cdot}\,\mathcal{C}_{101} + {\cdot}{\cdot}m\,\mathcal{C}_{110} + {\cdot}{\cdot}{\cdot}\,\mathcal{C}_{111}.$

Here, a “letter” like $mv$ represents the addition of dominoes, in top to bottom order, of types
$m$, $v$, respectively; the letter $m{\cdot}m$ means adding two $m$-dominoes on the top and on the bottom,
etc.

The system transforms into a linear system of equations with polynomial coefficients, upon
performing the substitutions

    $m \mapsto z, \qquad h \mapsto z^2, \qquad v \mapsto z^2.$

Solving it gives the generating functions of configurations with $z$ marking the area covered:

    $C_{000}(z) = \dfrac{(1 - 2z^3 - z^6)(1 + z^3 - z^6)}{(1 + z^3)(1 - 5z^3 - 9z^6 + 9z^9 + z^{12} - z^{15})}.$

In particular, the coefficient $[z^{3n}]C_{000}(z)$ is the number of tilings of an $n \times 3$ rectangle:

    $C_{000}(z) = 1 + 3z^3 + 22z^6 + 131z^9 + 823z^{12} + 5096z^{15} + \cdots.$

The sequence grows like $c\,\alpha^n$ (for $n \equiv 0 \pmod 3$) where $\alpha \doteq 1.83828$ ($\alpha$ is the cube root of
an algebraic number of degree 5). (See [109] for a computer algebra session.) On average, for
large $n$, there is a fixed proportion of monomers, and the number of monomers in a random
tiling of a large rectangle is asymptotically normally distributed, a result that follows from the
developments of Section IX.6, p. 650.
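
The column-by-column construction above can be sketched directly in code. The following Python fragment is one possible encoding, under our own conventions: a state is a bitmask recording which cells of the column about to be filled are already occupied by horizontal dimers protruding from the left, and a transition fills the remaining cells with monomers, vertical dimers, or new horizontal dimers. The function name and the bitmask representation are illustrative assumptions.

```python
def monomer_dimer_tilings(n: int, w: int = 3) -> int:
    """Number of coverings of a rectangle with n columns of height w by monomers and dimers."""
    n_states = 1 << w

    def fill(occupied: int, row: int, protrude: int, out: list) -> None:
        # Fill the current column from `row` downward; `protrude` collects the cells
        # of the next column taken by horizontal dimers started in this column.
        if row == w:
            out.append(protrude)
            return
        if (occupied >> row) & 1:
            fill(occupied, row + 1, protrude, out)
            return
        fill(occupied | (1 << row), row + 1, protrude, out)               # monomer
        fill(occupied | (1 << row), row + 1, protrude | (1 << row), out)  # horizontal dimer
        if row + 1 < w and not (occupied >> (row + 1)) & 1:
            fill(occupied | (3 << row), row + 2, protrude, out)           # vertical dimer

    # successors[s] lists the next state once for every way of completing a column in state s
    successors = []
    for s in range(n_states):
        out: list = []
        fill(s, 0, 0, out)
        successors.append(out)

    counts = [0] * n_states
    counts[0] = 1  # empty rectangle, nothing protrudes
    for _ in range(n):
        nxt = [0] * n_states
        for s, c in enumerate(counts):
            if c:
                for t in successors[s]:
                    nxt[t] += c
        counts = nxt
    return counts[0]  # complete coverings: nothing protrudes on the right
```

For $w = 3$ and $n = 0, \ldots, 5$ this reproduces the coefficients 1, 3, 22, 131, 823, 5096 of $C_{000}(z)$; the same routine with $w = n$ gives the first “diagonal” values 1, 7, 131, 10012.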


The tiling example is a typical illustration of the transfer matrix method as described
in Figure V.20, p. 356. One seeks to enumerate a “special” set of configurations:
in the example above, this is $\mathcal{C}_{000}$ representing complete rectangle coverings.
One determines an extended set of configurations $\mathcal{C}$ (the partial coverings, in the example)
such that: (i) $\mathcal{C}$ is partitioned into finitely many classes; (ii) there is a finite set
of “actions” that operate on the classes; (iii) size is affected in a well-defined additive
way by the actions. The similarity with finite automata is apparent: classes play the
rôle of states and actions the rôle of letters.

Often, the method of transfer matrices is used to approximate a hard combinatorial
problem that is not known to decompose, the approximation being by means of a
family of models of increasing “widths”. For instance, the enumeration of the number
$T_n$ of tilings of an $n \times n$ square by monomers and dimers remains a famous unsolved
problem of statistical physics. Here, transfer matrix methods may be used to solve the
$n \times w$ version of the monomer–dimer coverings, in principle at least, for any fixed
width $w$: the result will always be a rational function, although its degree, dictated by
the dimension of the transfer matrix, will grow exponentially with $w$. (The “diagonal”
sequence of the $n \times w$ rectangular models corresponds to the square model.) It has
been at least determined by computer search that the diagonal sequence $T_n$ starts as
(this is EIS A028420):

    1, 7, 131, 10012, 2810694, 2989126727, 11945257052321, ... .

From this and other numerical data, one estimates numerically that $(T_n)^{1/n^2}$ tends to
a constant, 1.94021..., for which no expression is known to exist. The difficulty of
coping with the finite-width models is that their complexity (as measured, e.g., by the
number of states) increases exponentially with $w$. Such models are best treated by
computer algebra (see [627]), but no law allowing to take a diagonal is visible. At
least, the finite-width models have the merit of providing provable upper and lower
bounds on the exponential growth rate of the hard “diagonal problem”.

In contrast, for coverings by dimers only, a strong algebraic structure is available
and the number of covers of an $n \times n$ square by horizontal and vertical dimers satisfies
a beautiful formula originally discovered by Kasteleyn ($n$ even):

(111)    $U_n = 2^{n^2/2} \prod_{j=1}^{n/2} \prod_{k=1}^{n/2} \left( \cos^2\frac{j\pi}{n+1} + \cos^2\frac{k\pi}{n+1} \right).$

This sequence is EIS A004003,

    1, 2, 36, 6728, 12988816, 258584046368, 53060477521960000, ... .

It is elementary to prove from (111) that

    $\lim_{n \to +\infty} (U_n)^{1/n^2} = \exp\left( \frac{1}{\pi} \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)^2} \right) = e^{G/\pi} \doteq 1.33851\ldots,$

where $G$ is Catalan’s constant. This means in substance that each cell has a number
of degrees of freedom equivalent to 1.33851. See Percus’ monograph [477] for proofs
of this famous result and Finch’s book [211, Sec. 5.23] for context and references.
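
Formula (111) is easy to evaluate numerically. The sketch below is a floating-point evaluation, adequate only for small $n$; it folds the factor $2^{n^2/2}$ into the $(n/2)^2$ factors of the product, each of which is multiplied by 4. The function name is ours.

```python
from math import cos, pi


def kasteleyn_square(n: int) -> int:
    """Number of dimer coverings of an n x n board (n even), evaluated from formula (111)."""
    if n % 2:
        raise ValueError("n must be even")
    product = 1.0
    for j in range(1, n // 2 + 1):
        for k in range(1, n // 2 + 1):
            # 4 * (cos^2 + cos^2): the (n/2)^2 factors of 4 account for 2^(n^2/2)
            product *= 4 * cos(j * pi / (n + 1)) ** 2 + 4 * cos(k * pi / (n + 1)) ** 2
    return round(product)
```

For $n = 0, 2, 4, 6, 8$ this returns 1, 2, 36, 6728, 12988816, the first terms listed above. For larger boards the rounding of a floating-point product stops being reliable and exact arithmetic would be needed.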

> V.46. Powers of Fibonacci numbers. Consider the OGFs 


    $G(z) := \frac{1}{1 - z - z^2} = \sum_{n \ge 0} F_{n+1} z^n, \qquad G^{[k]}(z) := \sum_{n \ge 0} (F_{n+1})^k z^n,$

where $F_n$ is a Fibonacci number. The OGF of monomer–dimer placements on a $k \times n$ board
when only monomers ($m$) and horizontal dimers ($h$) are allowed is obviously $G^{[k]}(z)$. On the
other hand, it is possible to set up a transfer matrix model with state $i$ ($0 \le i \le k$) corresponding
to $i$ positions of the current column occupied by a previous domino. Consequently,

    $G^{[k]}(z) = \operatorname{coeff}_{k,k}\,(I - zT)^{-1}, \qquad\text{where}\quad T_{i,j} = \binom{i}{i+j-k},$

for $0 \le i, j \le k$. (The denominator of $G^{[k]}(z)$ is known exactly: see [377, Ex. 1.2.8.30].) <
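
The identity of Note V.46 can be tested on small cases. The sketch below takes the matrix $T_{i,j} = \binom{i}{i+j-k}$ and reads state $i$ as the number of cells of the current column that are still free (an assumption on our part: from $i$ free cells, $k - j$ of them start a horizontal dimer, leaving $j$ free cells in the next column). It then compares $(T^n)_{k,k}$ with $(F_{n+1})^k$. Names are illustrative.

```python
from math import comb


def fibonacci_powers_via_transfer(k: int, n_max: int) -> list[int]:
    """Return [(T^n)_{k,k} for n = 0..n_max] with T_{i,j} = C(i, i+j-k), 0 <= i, j <= k."""
    size = k + 1
    T = [[comb(i, i + j - k) if i + j >= k else 0 for j in range(size)]
         for i in range(size)]
    row = [0] * size
    row[k] = 1  # start with all k cells of the first column free
    values = []
    for _ in range(n_max + 1):
        values.append(row[k])  # paths returning to the all-free state
        row = [sum(row[i] * T[i][j] for i in range(size)) for j in range(size)]
    return values


def fibonacci(n: int) -> int:
    """F_0 = 0, F_1 = 1, F_2 = 1, ..."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

For instance, `fibonacci_powers_via_transfer(2, 5)` gives 1, 1, 4, 9, 25, 64, the squares of 1, 1, 2, 3, 5, 8.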


> V.47. Tours on chessboards. The OGF of Hamiltonian tours on an n x w rectangle is rational 
(one is allowed to move from any cell to any other vertically or horizontally adjacent cell). The 
same holds for king’s tours and knight’s tours. <


> V.48. Cover time of graphs. Given a fixed digraph $\Gamma$ assumed to be strongly connected,
and a designated start vertex, one travels at random, moving at each time to any neighbour 
of the current vertex, making choices with equal likelihood. The expectation of the time to 
visit all the vertices is a rational number that is effectively (though perhaps not efficiently!) 
computable. [Hint: set up a transfer matrix, a state of which is a subset of vertices representing 
those vertices that have been already visited. For an interval $[0\,..\,m]$, this can be treated by the
dedicated theory of walks on the integer interval, as in Section V.4; for the complete graph, this
is equivalent to the coupon collector problem. Most other cases are “hard” to solve analytically 
and one has to turn to probabilistic approximations; see Aldous and Fill’s forthcoming book [11] 
for a probabilistic approach.] 


Example V.19. Self-avoiding walks and polygons. A long-standing open problem, shared by 
statistical physics, combinatorics, and probability theory alike, is that of quantifying properties 
of self-avoiding configurations on the square lattice (Figure V.22). Here we consider objects 
that, starting from the origin (the “root”), follow a path, and are solely composed of horizontal
and vertical steps of amplitude $\pm 1$. The self-avoiding walk or SAW can wander but is subject to
the condition that it never crosses nor touches itself. The self-avoiding polygons or SAPs, whose
class is denoted by $\mathcal{P}$, are self-avoiding walks, with only an exception at the end, where the
end-point must coincide with the origin. We shall focus here on polygons.

Figure V.22. A self-avoiding polygon or SAP (left) and a self-avoiding walk or SAW (right).

It proves convenient also
to consider unrooted polygons (also called simply-connected polyominoes), which are polygons
in which the origin is discarded, so that they plainly represent the possible shapes of SAPs up to
translation. For length $2n$, the number $p_n$ of unrooted polygons satisfies $p_n = P_n/(4n)$ since
the origin ($2n$ possibilities) and the starting vertex (2 possibilities) of the corresponding SAPs
are disregarded in that case. Here is a table, for small values of $n$, listing polyominoes and the
corresponding counting sequences $p_n$, $P_n$.


    n:                   2    3     4     5      6       7       8        9       10
    p_n (EIS A002931):   1    2     7    28    124     588    2938    15268    81826
    P_n (EIS A010566):   8   24   112   560   2976   16464   94016   549648  3273040
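
The first entries of this table can be confirmed by exhaustive search. The following sketch is a naive depth-first enumeration, not the transfer matrix method discussed next, and it is only practical for small perimeters. It counts closed self-avoiding walks of length $2n$ from the origin, which gives $P_n$, and then divides by $4n$ to obtain $p_n$. Function names are ours.

```python
def rooted_sap_count(n: int) -> int:
    """P_n: self-avoiding polygons of perimeter 2n, rooted at the origin and oriented."""
    if n < 2:
        raise ValueError("a polygon on the square lattice has perimeter at least 4")
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    visited = {(0, 0)}

    def extend(x: int, y: int, remaining: int) -> int:
        if remaining == 1:
            # the last step must return to the origin
            return 1 if abs(x) + abs(y) == 1 else 0
        if abs(x) + abs(y) > remaining:
            return 0  # too far away to come back in time
        total = 0
        for dx, dy in steps:
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt[0], nxt[1], remaining - 1)
                visited.remove(nxt)
        return total

    return extend(0, 0, 2 * n)


def unrooted_sap_count(n: int) -> int:
    """p_n = P_n / (4n): polygons up to translation, without root or orientation."""
    return rooted_sap_count(n) // (4 * n)
```

The running time grows exponentially with $n$, which is precisely why finite-width transfer matrices are used for record computations.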


Take the (widely open) problem of determining exactly the number $P_n$ of SAPs of perimeter
$2n$. This (intractable) problem can be approached as a limit of the (tractable) problem¹²
that consists in enumerating the collection $\mathcal{P}^{[w]}$ of SAPs of width $w$, for increasing values of $w$.
The latter problem is amenable to the transfer matrix method, as first discovered by Enting in
1980; see [192]. Indeed, take a polygon and consider a vertical sweepline, that moves from
left to right. Once width is fixed, there are only finitely many possibilities (at most exponentially
many in $w$) for the ways such a line may intersect the polygon’s edges at half integer abscissae.
(There are $w + 1$ edges and for
each of these, one should “remember” whether they connect with the upper or lower boundary.)
The transitions are then themselves finitely described. In this way, it becomes possible to set
up a transfer matrix for any fixed width $w$. For fixed $n$, by computing values of $P_n^{[w]}$ with
increasing $w$, one finally determines (in principle) the exact value of any $P_n$.

The program suggested above has been carried out to record values by the “Melbourne
School” under the impulse of Tony Guttmann. For instance, Jensen [356] found in 2003 that
the number of unrooted polygons of perimeter 100 is

    $p_{50} = 7545649677448506970646886033356862162.$

Attaining such record values necessitates algorithms that are much more sophisticated than the
naive approach we have just described, as well as a number of highly ingenious programming
optimizations.

¹² We limit ourselves here to a succinct description and refer to the original papers [192, 356] for
details.

It is an equally open problem to estimate asymptotically the number of SAPs of perimeter
$n$. Given the exact values up to perimeter 100 or more, a battery of fitting tests for
asymptotic formulae can be applied, leading to highly convincing (though still heuristic)
formulae. Thanks to several workers in this area, we can regard the final answer as “known”. From
the works of Jensen and his predecessors, it results that a reliable empirical estimate is of the
form

    $p_n = B\,\mu^{2n}\,(2n)^{-\beta}\,(1 + o(1)),$
    $\mu \doteq 2.63815\,85303, \qquad \beta = \frac{5}{2} \pm 3 \cdot 10^{-7}, \qquad B \doteq 0.5623013.$

Thus, the answer is almost certainly of the form $p_n \asymp \mu^{2n} n^{-5/2}$ for unrooted polygons and
$P_n \asymp \mu^{2n} n^{-3/2}$ for rooted polygons. It is believed that the same connective constant $\mu$ dictates
the exponential growth rate of self-avoiding walks. See Finch’s book [211, Sec. 5.10] for a
perspective and numerous references.

There is also great interest in the number $p_{m,n}$ of polyominoes with perimeter $2n$ and
area m, with area defined as the number of square cells composing the polyomino. Studies 
conducted by the Melbourne school yield numerical data that are consistent to an amazing 
degree (e.g., moments up to order ten and small-n corrections are considered) with the following 
assumption: The distribution of area in a fixed-perimeter polyomino obeys in the asymptotic 
limit an “Airy area distribution”. This distribution is defined as the limit distribution of the 
area under Dyck paths, a problem that was introduced on p. 330 and to which we propose to 
return in Chapter VII (p. 535) and IX (p. 706). See [356, 509, 510] and references therein for a 
specific discussion of polyomino area. It is finally of great interest to note that the interpretation 
of data was strongly guided by what is already known for exactly solvable models of the type 
we are repeatedly considering in this book.


Example V.20. Horizontally convex polyominoes. Pólya [490] and Temperley [574] independently
discovered an exactly solvable polyomino model. (See also the text by van Rensburg [592]
for more.) Define as usual a polyomino as a collection of unit squares with vertices
in $\mathbb{Z}_{\ge 0} \times \mathbb{Z}_{\ge 0}$ that forms a connected set without articulation points. Such a polyomino is said
to be horizontally convex (H.C.) if its intersection with any horizontal line is either empty or
an interval. An H.C. polyomino is thus a stack of a certain number of rows of squares, where
each row has a segment of length $\ge 1$ in common with the next row up. (We imagine H.C.
polyominoes growing from bottom to top.) The enumeration of such polyominoes, following
Temperley [574, p. 66], constitutes a nice extension of the transfer matrix method in the case
when the set of states is infinite.

Let $\mathcal{T}^{[k]}$ be the class of polyominoes with exactly $k$ square cells on their top row. The size
of a polyomino is its number of cells. We wish to enumerate the class $\mathcal{T} := \bigcup_k \mathcal{T}^{[k]}$. In order
to do so, according to the transfer matrix method, one needs to relate the $\mathcal{T}^{[k]}$ to one another.
Let $z$ be the variable marking size. The transition from one $\mathcal{T}^{[k]}$ to a $\mathcal{T}^{[\ell]}$ has a multiplicity
equal to $k + \ell - 1$. Thus the generating functions $t_k := T^{[k]}(z)$ satisfy an infinite system of
equations, which starts as

         $t_1 = z + z\,(t_1 + 2t_2 + 3t_3 + \cdots)$
(112)    $t_2 = z^2 + z^2\,(2t_1 + 3t_2 + 4t_3 + \cdots)$
         $t_3 = z^3 + z^3\,(3t_1 + 4t_2 + 5t_3 + \cdots).$

Figure V.23. Five horizontally convex polyominoes of size $n = 50$ drawn uniformly
at random.
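
Although the system (112) is infinite, extracting coefficients only involves finitely many unknowns, since a top row of $\ell$ cells consumes $\ell$ units of size. A minimal sketch of this computation follows. It assumes the general form $t_\ell = z^\ell + z^\ell \sum_k (k + \ell - 1)\, t_k$ of the equations, and the function name is illustrative.

```python
def hc_polyomino_counts(n_max: int) -> list[int]:
    """Return [T_1, ..., T_n_max]: horizontally convex polyominoes with n cells."""
    # t[n][l] = number of H.C. polyominoes with n cells whose top row has l cells
    t = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    for n in range(1, n_max + 1):
        for l in range(1, n + 1):
            if l == n:
                t[n][l] = 1  # a single row of n cells
            else:
                # a row of l cells placed on a polyomino with top row k: k + l - 1 positions
                t[n][l] = sum((k + l - 1) * t[n - l][k] for k in range(1, n - l + 1))
    return [sum(t[n]) for n in range(1, n_max + 1)]
```

The first values returned are 1, 2, 6, 19, 61, 196, 629, 2017.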


The system (112) corresponds to an infinite transfer matrix which is highly structured:

    $M(z)_{k,\ell} = (k + \ell - 1)\, z^{\ell},$

and, as shown by Temperley [574, p. 66], the system can be solved by elementary manipulations.
We shall however prefer to take another route, more in line with the spirit of this book.
In a case like this, it is well worth trying a bivariate generating function. Define

    $T(z, u) = \sum_{n,k} T_n^{[k]} z^n u^k.$

The action of “adding a slice” on the top row of a polyomino is reflected by a linear operator
$\mathcal{L}$ that transforms $u^k$, representing the top row of the polyomino before addition, into a sum of
monomials $u^{\ell} z^{\ell}$, with the proper multiplicities:

    $\mathcal{L}[u^k] = k\,(uz) + (k+1)\,(uz)^2 + \cdots = (k-1)\,\frac{uz}{1-uz} + \frac{uz}{(1-uz)^2}.$

(An earlier instance of the technique of “adding a slice” appears in the context of constrained
compositions, Example III.22, p. 199.) A better formula results if one expresses more generally
the quantity $\mathcal{L}[f(u)]$:

(113)    $\mathcal{L}[f(u)] = \frac{uz}{1-uz}\,\bigl(f'(1) - f(1)\bigr) + \frac{uz}{(1-uz)^2}\, f(1).$

Treat now the BGF $T(z, u)$ as a function of $u$, keeping $z$ as a parameter, and write for readability
$\tau(u) := T(z, u)$. A horizontally convex polyomino is obtained by starting from a bottom row
that can have any number of cells and repeatedly adding a slice. This construction is thus
reflected by the main functional equation

(114)    $\tau(u) = \frac{zu}{1-zu} + \mathcal{L}[\tau(u)] = \frac{zu}{1-zu} + \frac{zu}{1-zu}\,\tau'(1) + \frac{z^2u^2}{(1-zu)^2}\,\tau(1),$

upon making use of (113). Instantiating at $u = 1$ provides the first relation

(115)    $\tau(1) = \frac{z}{1-z} + \frac{z}{1-z}\,\tau'(1) + \frac{z^2}{(1-z)^2}\,\tau(1),$

while differentiation of (114) with respect to $u$ followed by the specialization $u = 1$ provides
the second relation

(116)    $\tau'(1) = \frac{z}{(1-z)^2} + \frac{z}{(1-z)^2}\,\tau'(1) + \frac{2z^2}{(1-z)^3}\,\tau(1).$

We now have a linear system of two equations in two unknowns, resulting in an expression
of $\tau(1) \equiv T(z) \equiv T(z, 1)$, which enumerates all horizontally convex polyominoes:

(117)    $T(z) = \frac{z(1-z)^3}{1 - 5z + 7z^2 - 4z^3}.$

(From (114) to (117), the whole calculation is barely three lines of code under a decent computer
algebra system.) Note that, the original system being infinite, it is far from obvious a priori that
the generating function should be rational; in the present context, rationality devolves from the
highly structured character of the transfer matrix.

The counting sequence obtained by expansion,

    $T(z) = z + 2z^2 + 6z^3 + 19z^4 + 61z^5 + 196z^6 + 629z^7 + 2017z^8 + \cdots,$

is EIS A001169 (“Number of board-pile polyominoes with n cells”). The asymptotic form is
then easily obtained: we find

    $T_n \sim C A^n, \qquad C \doteq 0.18091, \quad A \doteq 3.20556,$

with $A$ a cubic irrational.
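
The elimination leading from (115) and (116) to (117) can be checked in exact rational arithmetic, without a computer algebra system. The sketch below solves the $2 \times 2$ linear system in the unknowns $\tau(1)$, $\tau'(1)$ by Cramer's rule at a rational value of $z$ and compares the result with the closed form. Since both sides are rational functions of low degree, agreement at sufficiently many points implies identity. Function names are illustrative.

```python
from fractions import Fraction


def tau_from_linear_system(z: Fraction) -> Fraction:
    """Solve (115)-(116) for tau(1) at a rational value of z (Cramer's rule)."""
    a = z / (1 - z)              # constant term and tau'(1) coefficient in (115)
    b = z**2 / (1 - z) ** 2      # tau(1) coefficient in (115)
    c = z / (1 - z) ** 2         # constant term and tau'(1) coefficient in (116)
    d = 2 * z**2 / (1 - z) ** 3  # tau(1) coefficient in (116)
    # System:  (1 - b) tau - a tau' = a   and   -d tau + (1 - c) tau' = c
    det = (1 - b) * (1 - c) - a * d
    return (a * (1 - c) + a * c) / det


def tau_closed_form(z: Fraction) -> Fraction:
    """Right-hand side of (117)."""
    return z * (1 - z) ** 3 / (1 - 5 * z + 7 * z**2 - 4 * z**3)
```

A call such as `tau_from_linear_system(Fraction(1, 10)) == tau_closed_form(Fraction(1, 10))` compares the two expressions exactly, with no rounding involved.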

An alternative derivation, which is more sophisticated, is due to Klarner and is presented in 
Stanley’s book [552, §4.7]. Hickerson [333] has found a direct construction, which explains the 
rationality of the GF by means of a regular language encoding. (The drawings of Figure V.23 
have been obtained by an application of the recursive method [264] to Hickerson’s specifica- 
tion.) Louchard [420] has conducted an in-depth study of probabilistic properties of several 
parameters of H.C. polyominoes, using generating functions.


> V.49. Height of H.C. polyominoes. Upon introducing an extra variable v to encode height, 
one finds that height grows on average linearly with n and the variance is O(n), so that the 
distribution is concentrated [420]. (This explains the skinny aspects of polyominoes drawn in 
Figure V.23.) <


> V.50. A transfer matrix model for lattice paths. Consider the general context of weighted
lattice paths in Section V.4. Let $\alpha_j$, $\beta_j$, $\gamma_j$ be the weights of ascents, descents, and level steps,
respectively, when the starting altitude is $j$. The infinite transfer matrix,

$$T = \begin{pmatrix}
\gamma_0 & \alpha_0 & 0 & 0 & 0 & \cdots \\
\beta_1 & \gamma_1 & \alpha_1 & 0 & 0 & \cdots \\
0 & \beta_2 & \gamma_2 & \alpha_2 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix},$$

which has a tridiagonal form, “generates” all lattice paths via the quasi-inverse $(I - zT)^{-1}$.
In particular, any exactly solvable weighted lattice path model is equivalent to an explicit
structured matrix inversion. <


V.6.4. Value-constrained permutations. We conclude this chapter with a dis- 
cussion of a construction that combines transfer matrix methods with an inclusion— 
exclusion argument. We treat a collection of constrained permutation problems whose 
origin lies in nineteenth century recreational mathematics. For instance, the ménage 
problem solved and popularized by Édouard Lucas in 1891, see [129], has the following
quaint formulation: What is the number of possible ways one can arrange $n$
married couples (“ménages”) around a table in such a way that men and women alternate,
but no woman sits next to her husband?
The ménage problem is equivalent to a permutation enumeration problem. Sit
first conventionally the men at places numbered $1, 2, \ldots, n$, and the wives at positions
$\frac{3}{2}, \frac{5}{2}, \ldots, n + \frac{1}{2}$. Let $\sigma_i$ be such that the $i$th wife is placed at $\sigma_i + \frac{1}{2}$. Then, a ménage
placement imposes the conditions $\sigma_i \ne i$ and $\sigma_i \ne i - 1$ for each $i$. We consider here
a linearly arranged table (see remarks at the end for the other classical formulation
that considers a round table), so that the condition $\sigma_i \ne i - 1$ becomes vacuous when
$i = 1$. Here is a ménage placement for $n = 6$ and its corresponding permutation:

    [Figure: a ménage placement for $n = 6$, with the corresponding permutation of $1, 2, \ldots, 6$ in two-line notation.]

This is a generalization of the derangement problem (for which only the weaker condition
$\sigma_i \ne i$ is imposed and the cycle decomposition of permutations suffices to
provide a direct solution; see Example II.14, p. 122).


Definition V.9. Given a permutation $\sigma = \sigma_1 \cdots \sigma_n$, any quantity $\sigma_i - i$ is called an
exceedance of $\sigma$. Given a finite set of integers $\Omega \subset \mathbb{Z}_{\ge 0}$, a permutation is said to be
$\Omega$-avoiding if none of its exceedances lies in $\Omega$.


The original ménage problem is modelled by $\Omega = \{-1, 0\}$, or, up to a simple transformation,
by $\Omega = \{0, 1\}$.


Inclusion–exclusion. The set $\Omega$ being fixed, consider first for all $j$ the class
of augmented permutations $\mathcal{P}_{n,j}$ that are permutations of size $n$ such that $j$ of the
positions are distinguished and the corresponding exceedances lie in $\Omega$, the remaining
positions having arbitrary values (but with the permutation property being satisfied).
Loosely speaking, the objects in $\mathcal{P}_{n,j}$ can be regarded as permutations with “at least”
$j$ exceedances in $\Omega$. For instance, with $\Omega = \{1\}$ and

    $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ 2 & 3 & 4 & 8 & 6 & 7 & 1 & 5 & 9 \end{pmatrix},$

there are 5 exceedances that lie in $\Omega$ (at positions 1, 2, 3, 5, 6) and with 3 of these
distinguished (say by enclosing them in a box, rendered here by brackets), one obtains an element counted by
$P_{9,3}$, such as

    2 [3] [4] 8 6 [7] 1 5 9.

Let $P_{n,j}$ be the cardinality of $\mathcal{P}_{n,j}$. We claim that the number $Q_n = Q_n^{\Omega}$ of $\Omega$-avoiding
permutations of size $n$ satisfies

(118)    $Q_n = \sum_{j=0}^{n} (-1)^j P_{n,j}.$

Figure V.24. A graphical rendering of the legal template 20?02?11? relative to $\Omega = \{0, 1, 2\}$.


Equation (118) is typically an inclusion–exclusion relation. To prove it formally¹³,
define the number $R_{n,k}$ of permutations that have exactly $k$ exceedances in $\Omega$ and the
generating polynomials

    $P_n(w) = \sum_j P_{n,j}\, w^j, \qquad R_n(w) = \sum_k R_{n,k}\, w^k.$

The GFs are related by

    $P_n(w) = R_n(w + 1) \qquad\text{or}\qquad R_n(w) = P_n(w - 1).$

(The relation $P_n(w) = R_n(w + 1)$ simply expresses symbolically the fact that each
$\Omega$-exceedance in $\mathcal{R}$ may or may not be taken in when composing an element of $\mathcal{P}$.)
In particular, we have $P_n(-1) = R_n(0) = R_{n,0} = Q_n$ as was to be proved.
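
For small $n$ the relation between the two families of numbers can be observed by brute force. The sketch below enumerates all permutations, tabulates $R_{n,k}$, derives $P_{n,j}$ by choosing $j$ distinguished positions among the $k$ exceedances lying in $\Omega$ (which is the content of $P_n(w) = R_n(w+1)$), and evaluates the alternating sum (118). It is an illustration with names of our choosing, not an efficient method.

```python
from itertools import permutations
from math import comb


def exceedance_polynomials(n: int, omega: set) -> tuple[list[int], list[int]]:
    """Return (P, R) with P[j] = P_{n,j} and R[k] = R_{n,k} for the set omega."""
    R = [0] * (n + 1)
    for sigma in permutations(range(1, n + 1)):
        k = sum(1 for i, s in enumerate(sigma, start=1) if s - i in omega)
        R[k] += 1
    # Each permutation with k exceedances in omega yields C(k, j) augmented permutations.
    P = [sum(comb(k, j) * R[k] for k in range(j, n + 1)) for j in range(n + 1)]
    return P, R


def avoiding_count(n: int, omega: set) -> int:
    """Q_n computed through the inclusion-exclusion relation (118)."""
    P, _ = exceedance_polynomials(n, omega)
    return sum((-1) ** j * p for j, p in enumerate(P))
```

With $\Omega = \{0\}$ the function `avoiding_count` returns the derangement numbers 1, 0, 1, 2, 9, 44, 265 for $n = 0, \ldots, 6$, and in every case the result coincides with $R_{n,0}$.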


Transfer matrix model. The preceding discussion shows that everything relies on
the enumeration $P_{n,j}$ of permutations with distinguished exceedances in $\Omega$. Introduce
the alphabet $\mathcal{A} = \Omega \cup \{\text{‘?’}\}$, where the symbol ‘?’ is called the don’t-care symbol.
A word on $\mathcal{A}$, an instance with $\Omega = \{0, 1, 2\}$ being 20?02?11?, is called a template.
To an augmented permutation, one associates a template as follows: each exceedance
that is not distinguished is represented by a don’t-care symbol; each distinguished
exceedance (thereby an exceedance with value in $\Omega$) is represented by its value. A
template is said to be legal if it arises from an augmented permutation. For instance a
template $2\,1 \cdots$ cannot be legal since the corresponding constraints, namely $\sigma_1 - 1 = 2$,
$\sigma_2 - 2 = 1$, are incompatible with the permutation structure (one would have
$\sigma_1 = \sigma_2 = 3$). In contrast, the template 20?02?11? is seen to be legal. Figure V.24 is
a graphical rendering; there, letters of templates are represented by dominoes, with a
cross at the position of a numeric value in $\Omega$, and with the domino being blank in the
case of a don’t-care symbol.

Let $\mathcal{T}_{n,j}$ be the set of legal templates relative to $\Omega$ that have length $n$ and comprise
$j$ don’t-care symbols, and let $T_{n,j}$ denote its cardinality. Any such legal template is associated to exactly $j!$ permutations,
since $n - j$ position-value pairs are fixed in the permutation, while the $j$ remaining
positions and values can be taken arbitrarily. There results that

(119)    $P_{n,n-j} = j!\, T_{n,j}$  and  $Q_n = \sum_{j=0}^{n} (-1)^{n-j}\, j!\, T_{n,j}$

by (118). Thus, the enumeration of avoiding permutations rests entirely on the
enumeration of legal templates.

¹³ See also the discussion in Subsection III.7.4, p. 206.

The enumeration of legal templates is finally effected by means of a transfer matrix
method, or equivalently, by a finite automaton. If a template τ = τ_1 ⋯ τ_n is legal,
then the following condition is met,

(120) $\tau_i + i \neq \tau_j + j$,

for all pairs (i, j) such that i < j and neither of τ_i, τ_j is the don't-care symbol. (There
are additional conditions to characterize templates fully, but these only concern a few
letters at the end of templates and we may ignore them in this discussion.) In other
words, a τ_i with a numerical value preempts the value τ_i + i. Figure V.24 exemplifies
the situation in the case Ω = {0, 1, 2}. The dominoes are shifted one position
each time (since it is the value of σ_i − i that is represented) and the compatibility
constraint (120) is that no two crosses should be vertically aligned. More precisely, the
constraints (120) are recognized by a deterministic finite automaton whose states are
indexed by subsets of {0, …, b − 1}, where the "span" b is defined as $b = \max_{\omega \in \Omega} \omega$.
The initial state is the one associated with the empty set (no constraint is present
initially), and the transitions are of the form (j ∈ {0, …, b}):

$(q_S, j) \mapsto q_{S'}$ where $S' = ((S-1) \cup \{j-1\}) \cap \{0, \ldots, b-1\}$,
$(q_S, ?) \mapsto q_{S'}$ where $S' = (S-1) \cap \{0, \ldots, b-1\}$.

The initial state is $q_{\emptyset}$ and it is equal to the final state (this translates the fact that
no domino can protrude from the right, and is implied by the linear character of the
ménage problem under consideration). In essence, the automaton only needs a finite
memory since the dominoes slide along the diagonal and, accordingly, constraints
older than the span can be forgotten. Notice that the complexity of the automaton, as
measured by its number of states, is $2^b$.
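The automaton and the inclusion-exclusion sum (119) can be tried out numerically. The following is a minimal Python sketch, not the authors' implementation: the function names are hypothetical, and it assumes one reading of constraint (120), namely that a numerical letter w may be read in state S only when w does not belong to S. Legal templates are those that bring the automaton back to the empty state.

```python
from itertools import permutations
from math import factorial


def template_counts(omega, n):
    """Return [T_0, ..., T_n]: T_j counts legal templates of length n
    with j don't-care symbols, for the exceedance set `omega`."""
    omega = sorted(set(omega))
    # Each state S maps to a list c, where c[j] counts the partial
    # templates ending in S that contain j don't-care symbols.
    states = {frozenset(): [1] + [0] * n}
    for _ in range(n):
        nxt = {}
        for S, counts in states.items():
            # Don't-care symbol: existing constraints age by one step.
            shifted = frozenset(s - 1 for s in S if s >= 1)
            acc = nxt.setdefault(shifted, [0] * (n + 1))
            for j in range(n):
                acc[j + 1] += counts[j]
            # Numerical letter w: assumed legal only if offset w is free.
            for w in omega:
                if w in S:
                    continue
                target = frozenset(s - 1 for s in (S | {w}) if s >= 1)
                acc = nxt.setdefault(target, [0] * (n + 1))
                for j in range(n + 1):
                    acc[j] += counts[j]
        states = nxt
    # Final state = empty set: no domino protrudes on the right.
    return states.get(frozenset(), [0] * (n + 1))


def avoiding_count(omega, n):
    """Inclusion-exclusion sum (119) over template counts."""
    T = template_counts(omega, n)
    return sum((-1) ** (n - j) * factorial(j) * T[j] for j in range(n + 1))


def brute_force(omega, n):
    """Direct count of permutations with no exceedance sigma_i - i in omega."""
    return sum(
        all(p[i] - i not in omega for i in range(n))
        for p in permutations(range(n))
    )
```

For small n the two counts can be compared directly, for instance `avoiding_count({0, 1}, 6)` against `brute_force({0, 1}, 6)`.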

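Here are the automata corresponding to Ω = {0} (derangements) and to Ω = {0, 1} (ménages).

[Figure: state diagrams of the two automata, with a single state for derangements and two states for ménages.]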


For the ménage problem, there are two states depending on whether or not the cur- 
rently examined value has been preempted at the preceding step. 

From the automaton construction, the bivariate GF $T^{\Omega}(z,u)$ of legal templates,
with u marking the position of don't-care symbols, is a rational function that can
be determined in an automatic fashion from Ω. For the derangement and ménage
problems, one finds

$T^{\{0\}}(z,u) = \frac{1}{1 - z(1+u)}, \qquad T^{\{0,1\}}(z,u) = \frac{1-z}{1 - z(2+u) + z^2}.$

In general, this gives access to the OGF of the corresponding permutations. Indeed,
the OGF of Ω-avoiding permutations is obtained from $T^{\Omega}$ by a transformation akin
to the Laplace transform: we have

(121) $z^n u^j \mapsto (-z)^n (-1)^j\, j!$, so that $Q^{\Omega}(z) = \int_0^{\infty} e^{-u}\, T^{\Omega}(-z,-u)\, du$,


which transcribes (119) and constitutes a first closed-form solution. In addition,
consider the partial fraction expansion of $T^{\Omega}(z,u)$ with respect to u, taken as

(122) $T^{\Omega}(z,u) = \sum_{r} \frac{c_r(z)}{1 - u\, u_r(z)}$,

assuming for simplicity only simple poles. There, the sum is finite and it involves
algebraic functions c_r and u_r of the variable z. Define next the (divergent) OGF of all
permutations,

$F(y) = \sum_{n=0}^{\infty} n!\, y^n = {}_2F_0[1, 1; y]$,

in the terminology of hypergeometric functions (Note B.15, p. 751). Then, by (121)
and (122), we find

(123) $Q^{\Omega}(z) = \sum_{r} c_r(-z)\, F(-u_r(-z))$.

In other words: the OGF of Ω-avoiding permutations is expressible both as the
Laplace transform of a bivariate rational function (121) and as a composition (123)
of the OGF of the factorial series with algebraic functions.

The expressions (122) simplify much in the case of ménages and derangements
where the denominators of T are of degree 1 in u. One finds

$Q^{\{0\}}(z) = \frac{1}{1+z}\, F\left(\frac{z}{1+z}\right) = 1 + z^2 + 2z^3 + 9z^4 + 44z^5 + 265z^6 + 1854z^7 + \cdots$

for derangements, whence a new derivation of the known formula,

$Q^{\{0\}}_n = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (n-k)!.$
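The composition of the (divergent) factorial series with an algebraic function is perfectly meaningful at the level of formal power series, since y(z) = O(z). As an illustration, here is a minimal Python sketch (helper names are hypothetical) that expands $\frac{1}{1+z} F(z/(1+z))$ by truncated series arithmetic and compares the result with the alternating sum for derangements.

```python
from math import comb, factorial


def mul(a, b, N):
    """Product of two power series (coefficient lists), truncated at z^N."""
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b[: N + 1 - i]):
                c[i + j] += ai * bj
    return c


def factorial_series_of(y, N):
    """Coefficients up to z^N of F(y(z)) = sum_m m! y(z)^m, assuming y(0) = 0."""
    total = [0] * (N + 1)
    power = [1] + [0] * N          # y^0
    for m in range(N + 1):         # y^m only contributes from z^m onwards
        for i in range(N + 1):
            total[i] += factorial(m) * power[i]
        power = mul(power, y, N)
    return total


def derangement_series(N):
    inv = [(-1) ** k for k in range(N + 1)]      # 1/(1+z)
    y = [0] + [(-1) ** k for k in range(N)]      # z/(1+z)
    return mul(inv, factorial_series_of(y, N), N)


def derangement_formula(n):
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))
```

Other choices of the prefactor and of y(z) can be handled by the same two helpers; only the coefficient lists passed to them change.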


Similarly, for (linear) ménage placements, one finds

$Q^{\{0,1\}}(z) = \frac{1}{1+z}\, F\left(\frac{z}{(1+z)^2}\right) = 1 + z^3 + 3z^4 + 16z^5 + 96z^6 + 675z^7 + \cdots,$

which is EIS A000271 and corresponds to the formula

$Q^{\{0,1\}}_n = \sum_{k=0}^{n} (-1)^k \binom{2n-k}{k} (n-k)!.$

Finally, the same technique adapts to constraints that "wrap around", that is,
constraints taken modulo n. (This corresponds to a round table in the ménage problem.)
In that case, what should be considered is the circuits in the automaton recognizing
templates (see also the discussion on p. 337). One obtains in this way the OGF of the
circular (i.e., classical) ménage problem (EIS A000179),

$\widehat{Q}^{\{0,1\}}(z) = \frac{1-z}{1+z}\, F\left(\frac{z}{(1+z)^2}\right) + 2z = 1 + z + z^3 + 2z^4 + 13z^5 + 80z^6 + 579z^7 + \cdots,$

which yields the classical solution of the (circular) ménage problem,

$\widehat{Q}^{\{0,1\}}_n = \sum_{k=0}^{n} (-1)^k\, \frac{2n}{2n-k} \binom{2n-k}{k} (n-k)!.$
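As a sanity check on the circular case, the alternating sum can be compared with an exhaustive count. The sketch below is illustrative only; the function names are hypothetical, and the brute-force routine assumes that "wrap around" means that the differences σ_i − i are taken modulo n.

```python
from itertools import permutations
from math import comb, factorial


def touchard(n):
    """Circular menage number from the alternating sum (used here for n >= 2)."""
    total = 0
    for k in range(n + 1):
        # 2n/(2n-k) * C(2n-k, k) is an integer, so the division is exact.
        weight = 2 * n * comb(2 * n - k, k) // (2 * n - k)
        total += (-1) ** k * weight * factorial(n - k)
    return total


def circular_menage_brute_force(n):
    """Permutations with (sigma_i - i) mod n outside {0, 1} for every i."""
    return sum(
        all((p[i] - i) % n not in (0, 1) for i in range(n))
        for p in permutations(range(n))
    )
```

The exhaustive count is only practical for small n, which is enough to confirm the first coefficients of the series.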


This last formula is due to Touchard; see [129, p. 185] for pointers to the vast classical 
literature on the subject. The algebraic part of the treatment above is close to the 
inspiring discussion found in Stanley’s book [552]. An application to robustness of 
interconnections in random graphs is presented in [239]. 


Asymptotic analysis. For asymptotic analysis purposes, the following property
proves useful. Let F be the OGF of factorial numbers and assume that y(z) is analytic
at the origin where it satisfies y(z) = z − λz² + O(z³); then the following estimate
holds:

(124) $[z^n] F(y(z)) \sim [z^n] F(z(1 - \lambda z)) \sim n!\, e^{-\lambda}.$

(The proof results from simple manipulations of divergent series in the style of [36, §5].)
This gives at sight the estimates

$Q^{\{0\}}_n \sim n!\, e^{-1}, \qquad Q^{\{0,1\}}_n \sim n!\, e^{-2}.$
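These two estimates are easy to observe numerically. The following Python sketch (function names are hypothetical) evaluates the alternating sums for derangements and for linear ménage placements exactly, then compares the ratios to n! with e^{−1} and e^{−2}.

```python
from fractions import Fraction
from math import comb, exp, factorial


def derangements(n):
    return sum((-1) ** k * comb(n, k) * factorial(n - k) for k in range(n + 1))


def linear_menages(n):
    return sum(
        (-1) ** k * comb(2 * n - k, k) * factorial(n - k) for k in range(n + 1)
    )


def ratio_to_factorial(count, n):
    """count / n! as a float, computed from exact integers."""
    return float(Fraction(count, factorial(n)))


if __name__ == "__main__":
    for n in (5, 10, 50, 200):
        print(
            n,
            ratio_to_factorial(derangements(n), n) - exp(-1),
            ratio_to_factorial(linear_menages(n), n) - exp(-2),
        )
```

The printed differences shrink as n grows, in line with (124) applied to y(z) = z/(1+z) (λ = 1) and y(z) = z/(1+z)² (λ = 2).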


Generally, one has:
Proposition V.11. For any set Ω containing λ elements, the number of permutations
without exceedances in Ω satisfies

$Q^{\Omega}_n \sim n!\, e^{-\lambda}.$

Furthermore, the number $R^{\Omega}_{n,k}$ of permutations having exactly k occurrences (k fixed)
of an exceedance in Ω is asymptotic to

$R^{\Omega}_{n,k} \sim n!\, e^{-\lambda}\, \frac{\lambda^k}{k!}.$

That is, the rare event that an exceedance belongs to Ω is asymptotically governed by
a Poisson distribution of rate λ = |Ω|.


This statement is established by means of elementary combinatorial manipulations
in Bender's survey [36, §4.2] and by probabilistic techniques in the book of Barbour,
Holst, and Janson [29, Sec. 4.3]. The relation (124) provides a way of arriving
at such estimates by purely analytic-combinatorial techniques.


▷ V.51. Other constrained permutations. Given a permutation σ = σ_1 ⋯ σ_n, a succession gap
is defined as any difference σ_{i+1} − σ_i.
In how many ways can a kangaroo jump through all points of the integer interval [1, n + 1]
starting at 1 and ending at n + 1, while making hops that are restricted to {−1, 1, 2}? (The OGF
is the rational function 1/(1 − z − z³) corresponding to EIS A000930.)
The number R_n of permutations of size n such that σ_{i+1} − σ_i ≠ 1 has OGF F(z/(1 + z)),
the coefficients being EIS A000255, with asymptotics R_n ∼ n! e^{−1}. The number S_n of those
such that |σ_{i+1} − σ_i| ≠ 1 has OGF F(z(1 − z)/(1 + z)). Proof (for S_n): Use inclusion-exclusion
based on configurations with distinguished sequences of ±1 successions, which leads to the OGF

$\sum_{m \ge 0} m! \left( z + \frac{2 z^2 u}{1 - zu} \right)^m \Bigg|_{u=-1} = F\left( \frac{z(1-z)}{1+z} \right) = 1 + z + 2z^4 + 14z^5 + 90z^6 + 646z^7 + \cdots;$

cf. EIS A002464 and [4]; this is the number of placements of n kings on a chessboard, one
per line, one per column, and in non-attacking position. Asymptotically, one has S_n ∼ n! e^{−2},
see [572], in accordance with (124). In general, what about the counting of permutations whose
succession gaps are constrained to lie outside of a finite set Ω? ◁
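The kangaroo question of Note V.51 lends itself to a quick experiment. The Python sketch below is an illustration under stated assumptions (the function names are hypothetical): it enumerates the admissible jump sequences exhaustively and compares the counts with the coefficients of 1/(1 − z − z³), which satisfy a_n = a_{n−1} + a_{n−3}.

```python
def kangaroo_paths(n):
    """Count walks visiting every point of {1, ..., n+1} exactly once,
    from 1 to n+1, with hops restricted to {-1, +1, +2}."""
    target = n + 1
    full = (1 << target) - 1           # bitmask with all points visited

    def walk(pos, visited):
        if visited == full:
            return 1 if pos == target else 0
        total = 0
        for hop in (-1, 1, 2):
            nxt = pos + hop
            if 1 <= nxt <= target and not ((visited >> (nxt - 1)) & 1):
                total += walk(nxt, visited | (1 << (nxt - 1)))
        return total

    return walk(1, 1)


def rational_coefficients(N):
    """Coefficients of 1/(1 - z - z^3) up to z^N."""
    a = []
    for n in range(N + 1):
        a.append(1 if n == 0 else a[n - 1] + (a[n - 3] if n >= 3 else 0))
    return a
```

The agreement reflects the fact that an admissible walk decomposes into unit steps and three-hop blocks of the form +2, −1, +2.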


▷ V.52. Superménage numbers. Let T_n be the number of permutations of size n such that
(σ_i − i) ∉ {0, 1, 2}. The OGF is

$T(z) = \frac{1}{1 - z^2} \left( -z + F\left( \frac{z(1-z)}{(1+z)(1+z-z^2)} \right) \right) = 1 + z^4 + 5z^5 + 33z^6 + 236z^7 + \cdots;$

see [222] and EIS A001887. Asymptotically: T_n ∼ n! e^{−3}. ◁


V.7. Perspective 


The theorems in this chapter demonstrate the power of the fundamental tech- 
niques developed in Chapter IV, which exploit classical theorems in complex analysis 
to develop coefficient asymptotics. As we begin to see here, this approach applies
to many of the generating functions derived from the formal combinatorial techniques 
of Part A of this book. By paying careful attention to the types of combinatorial con- 
structions involved, we are able to identify abstract schemas that help us solve whole 
classes of problems at once. Each schema connects a type of combinatorial construc- 
tion to a complex asymptotic method. In this way, it becomes possible to discuss 
properties shared by an infinite collection of combinatorial classes. In this chapter, 
we have presented the method in detail for classes that involve a sequence construc- 
tion and classes recursively defined by a linear system of equations (paths in graphs, 
automata, transfer matrices). 

In an ideal world, we might wish to have a direct correspondence between com- 
binatorial constructions and analytic methods—a theory that would carry all the way 
from combinatorial objects of any description to full analysis of all their properties. 
The case of paths in graphs and automata, with its strong connectedness condition 
leading to Perron—Frobenius theory, is an instance of this ideal situation. Reality is 
however usually a bit more complex: theorems for deriving asymptotic results from 
combinatorial specifications must often have some sort of analytic side conditions. A 
typical example is the radius of convergence condition for supercritical sequences. As
soon as such side conditions are satisfied, the asymptotic properties of large structures
become highly predictable. This is the very essence of analytic combinatorics. 

In the next two chapters, we investigate generating functions whose singularities 
are no longer poles—fractional exponents and logarithmic factors become allowed. 
This first necessitates investing in general methodology, a task undertaken in Chap- 
ter VI where the method known as singularity analysis is developed. Then, a chapter 
parallel to the present one, Chapter VII, will present a number of new schemas based 
on the set and cycle constructions, as well as on recursion. 


Bibliographic notes. Applications of rational functions in discrete and continuous mathemat- 
ics are in abundance. Many examples are to be found in Goulden and Jackson’s book [303]. 
Stanley [552] even devotes a full chapter of his book Enumerative Combinatorics, vol. I, to 
rational generating functions. These two books push the theory further than we can do here, 
but the corresponding asymptotic aspects which we develop lie outside of their scope. The 
analytic theory of positive rational functions starts with the works of Perron and Frobenius at 
the beginning of the twentieth century and is explained in books on matrix theory like those
of Bellman [34] and Gantmacher [276]. Its importance has been long recognized in the theory 
of finite Markov chains, so that the basic theory of positive matrices is well developed in many 
elementary treatises on probability theory. For such aspects, we refer for instance to the classic 
presentations by Feller [205] or Karlin and Taylor [363]. 

The supercritical sequence schema is the first in a list of abstract schemas that neatly exem- 
plify the interplay between combinatorial, analytic, and probabilistic properties of large random 
structures. The origins of this approach are to be traced to early works of Bender [35, 36] fol- 
lowed by Soria and Flajolet [258, 260, 547]. 

Turning to more specific topics, we mention in relation to Section V.4 the first global at- 
tempt at a combinatorial theory of continued fractions by Flajolet in [214] together with related 
works of Jackson of which an exposition is to be found in [303, Ch. 5] and a synthesis in [238], 
in relation to birth and death processes. Walks on graphs from an algebraic standpoint are well 
discussed in Godsil’s book [295]; for infinite graphs and groups, see Woess [613]. The discus- 
sion of local constraints in permutations based on [239] combines some of the combinatorial 
elements found in Stanley's book [552] with the general philosophy of analytic combinatorics.
Our treatment of words and languages largely draws its inspiration from the line of research
started by Schützenberger in the early 1960s and on the subsequent account to be found in
Lothaire’s book [413]. A nice review of transfer matrix methods (including a discussion of 
limit distributions) is offered by Bender, Richmond, and Williamson in [46]. 


Applied mathematics is bad mathematics. 


— PAUL HALMos [317] 


Good applied mathematics is like the unicorn: 
something we can all recognize but seldom actually see. 


— DAVID ALDOUS 
(in Statistical Science, Vol. 5, No. 4 (Nov., 1990), pp. 446-447)
Singularity Analysis of Generating
Functions 


Es ist eine Tatsache, daß die genauere Kenntnis
des Verhaltens einer analytischen Funktion

in der Nähe ihrer singulären Stellen

eine Quelle von arithmetischen Sätzen ist¹.


— ERICH HECKE [326, Kap. VIII]
A function's singularities are reflected in the function's coefficients. Chapters IV
and V have treated in detail rational fractions and meromorphic functions, where the 
local analysis of polar singularities provides contributions to coefficients in the form of 
exponential—polynomials (products of polynomials and exponentials). In this chapter, 
we present a general approach to the analysis of coefficients of generating functions 
that is not restricted to polar singularities and extends to a large class of functions that 
have moderate growth or decay at their dominant singularities. It includes a number 
of functions coming from combinatorial constructions of Part A. The basic principle 
behind the extension is the existence of a general correspondence between 


the asymptotic expansion of a function near its dominant singularities 
and 
the asymptotic expansion of the function’s coefficients. 
This mapping preserves orders of growth in the sense that larger functions tend to 
have larger coefficients. It extends considerably the analysis of meromorphic
functions in Chapters IV–V and further justifies the Principles of Coefficient
Asymptotics enunciated in Chapter IV, p. 227.


1 "It is a fact that the precise knowledge of the behaviour of an analytic function in the vicinity of its
singular points is a source of arithmetic properties."

Precisely, the method of singularity analysis applies to functions whose singular
expansion involves fractional powers and logarithms; one sometimes refers to such
singularities as "algebraic–logarithmic". It centrally relies on two ingredients.


(i) A catalogue of asymptotic expansions for coefficients of the standard func- 
tions that occur in such singular expansions. 

(ii) Transfer theorems, which allow us to extract the asymptotic order of coeffi- 
cients of error terms in singular expansions. 


The developments are based on Cauchy’s coefficient formula, used in conjunction with 
special contours of integration known as Hankel contours. The contours come very 
close to the singularities then steer away: by design, they capture essential asymptotic 
information contained in the functions’ singularities. 

The method of singularity analysis is robust: functions amenable to it are closed 
under a variety of operations, including sum, product, integration, differentiation, and 
composition. Another important feature of the method is that it only necessitates local 
asymptotic properties of the function to be analysed. In this way, it often proves instru- 
mental in the case of functions that are only indirectly accessible through functional 
equations. 

This chapter is meant to develop the basic technology of singularity analysis and, 
like Chapter IV, it is largely of a methodological nature. We illustrate the approach 
with a few combinatorial problems, including simple varieties of trees (e.g., unary–binary
trees), combinatorial sums, the supercritical cycle construction, supertrees,
Pólya's drunkard walks, and tree recurrences. The next chapter, Chapter VII, will
systematically explore combinatorial structures and schemas as well as functional
equations that can be asymptotically analysed by means of singularity analysis in a way
that parallels the applications of rational and meromorphic asymptotics in Chapter V. 


VI.1. A glimpse of basic singularity analysis theory 


Rational and meromorphic functions involve, locally near a singularity ζ, elements
of the form $(1 - z/\zeta)^{-r}$, with $r \in \mathbb{Z}_{\ge 1}$. Accordingly their coefficients
involve asymptotically exponential–polynomials, that is, finite linear combinations of
elements of the type $\zeta^{-n} n^{r-1}$, with r a positive integer. We examine here an
approach that takes into account functions whose singularities are of a richer nature than
mere poles found in rational and meromorphic functions. Specifically, we consider
functions whose expansion at a singularity ζ involves elements of the form

$\left(1 - \frac{z}{\zeta}\right)^{-\alpha} \left( \log \frac{1}{1 - \frac{z}{\zeta}} \right)^{\beta}.$

Under suitable conditions to be discussed in detail in this chapter, any such element
contributes a term of the form

$\zeta^{-n}\, n^{\alpha-1}\, (\log n)^{\beta}.$

Here, α and β can be arbitrary complex numbers.

Location of singularities and exponential factors. The exponential factor ζ^{−n}
present in earlier expansions is easily accounted for, since the location of the
dominant singularities always induces a multiplicative exponential factor for coefficients.
Indeed, if f(z) is singular at z = ζ, then g(z) = f(zζ) satisfies, by the scaling rule of
Taylor expansions,

$[z^n] f(z) = \zeta^{-n} [z^n] f(z\zeta) = \zeta^{-n} [z^n] g(z),$
where g(z) now has a singularity at z = 1. Consequently, in the discussion that 


follows, we shall examine functions that are singular at 1, a condition that entails no 
loss of generality. 


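Basic scale. Consider the following table of commonly encountered functions
that are singular at 1, together with their coefficients:

(1)
        Function                              Coefficient (exact)                    Coefficient (asympt.)
(f_1)   $1 - \sqrt{1-z}$                      $\frac{2}{n\,4^n}\binom{2n-2}{n-1}$    $\sim \frac{1}{2\sqrt{\pi n^3}}$
(f_2)   $\frac{1}{\sqrt{1-z}}$                $\frac{1}{4^n}\binom{2n}{n}$           $\sim \frac{1}{\sqrt{\pi n}}$
(f_3)   $\frac{1}{1-z}$                       $1$                                    $\sim 1$
(f_4)   $\frac{1}{1-z}\log\frac{1}{1-z}$      $H_n$                                  $\sim \log n$
(f_5)   $\frac{1}{(1-z)^2}$                   $n+1$                                  $\sim n$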


Some structure is apparent in this table: a logarithmic factor in the function is reflected 
by a similar factor in the coefficients, square-roots somehow induce square-roots, and 
functions involving larger powers do have larger coefficients. 
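The entries of such a table of exact versus asymptotic coefficients are easily compared by machine. The Python sketch below is illustrative (the function name and the labels "f1" to "f5" are conventions chosen here): it evaluates the exact coefficients of 1 − √(1−z), 1/√(1−z), 1/(1−z), (1/(1−z)) log(1/(1−z)) and 1/(1−z)² at a given index n ≥ 1, together with the classical first-order estimates.

```python
from fractions import Fraction
from math import comb, log, pi, sqrt


def basic_scale(n):
    """Map each label to (exact coefficient, asymptotic estimate), for n >= 1."""
    harmonic = sum(1.0 / k for k in range(1, n + 1))
    return {
        "f1": (
            float(Fraction(2 * comb(2 * n - 2, n - 1), n * 4**n)),
            1 / (2 * sqrt(pi * n**3)),
        ),
        "f2": (float(Fraction(comb(2 * n, n), 4**n)), 1 / sqrt(pi * n)),
        "f3": (1.0, 1.0),
        "f4": (harmonic, log(n)),
        "f5": (float(n + 1), float(n)),
    }


if __name__ == "__main__":
    for label, (exact, approx) in basic_scale(1000).items():
        print(label, exact, approx, exact / approx)
```

The ratios are close to 1 for the algebraic entries, while the logarithmic entry converges more slowly, since H_n − log n tends to Euler's constant.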

It is easy to come up at least with a partial explanation of these observations.
Regarding basic functions such as f_1, f_2, f_3, and f_5, the Newton expansion

$(1-z)^{-\alpha} = \sum_{n=0}^{\infty} \binom{n+\alpha-1}{n} z^n$

when specialized to an integer $\alpha = r \in \mathbb{Z}_{\ge 1}$ immediately gives the asymptotic form
of the coefficients involved,

(2) $[z^n](1-z)^{-r} = \frac{(n+1)(n+2)\cdots(n+r-1)}{(r-1)!} = \frac{n^{r-1}}{(r-1)!}\left(1 + O\left(\frac{1}{n}\right)\right).$

For general α, it is therefore natural to expect

(3) $[z^n](1-z)^{-\alpha} = \binom{n+\alpha-1}{n} = \frac{n^{\alpha-1}}{(\alpha-1)!}\left(1 + O\left(\frac{1}{n}\right)\right).$

It turns out that this asymptotic formula remains valid for real or complex α, provided
we interpret (α − 1)! suitably. We shall prove the estimate

(4) $[z^n](1-z)^{-\alpha} = \frac{n^{\alpha-1}}{\Gamma(\alpha)}\left(1 + O\left(\frac{1}{n}\right)\right),$


Figure VI.1. The five functions from Equation (1) and a plot of their coefficient 
sequences illustrate the tendency of coefficient extraction to be consistent with orders 
of growth of functions. 


where Γ(α) is the Euler Gamma function defined as

(5) $\Gamma(\alpha) := \int_0^{\infty} e^{-t}\, t^{\alpha-1}\, dt,$

for ℜ(α) > 0, which coincides with (α − 1)! whenever α is an integer. (Basic properties
of this function are recalled in Appendix B.3: Gamma function, p. 743.)

We observe from the pair (2)–(3) that functions that are larger at the singularity
z = 1 have indeed larger coefficients (see Figure VI.1). The correspondence that
this observation suggests is general, as we are going to see repeatedly throughout this
chapter. A catalogue of exact or asymptotic forms for coefficients of standard singular
functions is obtained in Section VI.2 (see Theorem VI.1, p. 381).


Transfer of error terms. An asymptotic expansion of a function f(z) that is
singular at z = 1 is typically of the form

(6) $f(z) = \sigma(z) + O(\tau(z))$, where $\tau(z) = o(\sigma(z))$ as z → 1,

with σ and τ belonging to an asymptotic scale of standard functions such as the
collection $\{(1-z)^{-\alpha}\}_{\alpha \in \mathbb{R}}$, in simpler cases. Taking formally Taylor coefficients in the
expansion (6), we arrive at

(7) $f_n \equiv [z^n] f(z) = [z^n] \sigma(z) + [z^n] O(\tau(z)).$

The term [z^n]σ(z) is described asymptotically by (4). Therefore, in order to extract
asymptotic information on the coefficients of f(z), one needs a way of extracting
coefficients of functions known only by their order of growth around the singularity.
Such a translation of error terms from functions to coefficients is achieved by transfer
theorems, which, under conditions of analytic continuation, guarantee that

$[z^n] O(\tau(z)) = O([z^n] \tau(z));$

see Section VI.3 and Theorem VI.3, p. 390. This relation is much more profound than
its symbolic form would seem to imply.

In summary, it is the goal of this chapter to expound the (favorable) conditions
under which we have available the correspondence

(8) $f(z) = \sigma(z) + O(\tau(z)) \quad \Longrightarrow \quad f_n = \sigma_n + O(\tau_n),$

which defines the process known as singularity analysis: cf. Section VI.4 and
Theorem VI.4, p. 393. (This is seen to parallel the analysis of coefficients of rational and
meromorphic functions presented in Chapters IV and V.) We develop the method for
functions from the scale

$(1-z)^{-\alpha} \left( \log \frac{1}{1-z} \right)^{\beta} \qquad (z \to 1),$

whose coefficients have subexponential factors of the form

$n^{\alpha-1} (\log n)^{\beta}.$


(The range of singular behaviours taken into account by singularity analysis is even 
considerably larger: iterated logarithms (log log’s) as well as more exotic functions 
can be encapsulated in the method.) 


Example VI.1. First asymptotics of 2-regular graphs. As an illustration of the modus
operandi of singularity analysis, consider the class R of labelled 2-regular graphs (Note II.22,
p. 133):

$\mathcal{R} = \mathrm{SET}(\mathrm{UCYC}_{\ge 3}(\mathcal{Z})) \quad \Longrightarrow \quad R(z) = \exp\left( \frac{1}{2} \log \frac{1}{1-z} - \frac{z}{2} - \frac{z^2}{4} \right),$

where UCYC is the undirected cycle construction.
Singularity analysis permits us to reason as follows. The function

$R(z) = \frac{e^{-z/2 - z^2/4}}{\sqrt{1-z}}$

is only singular at z = 1 where it has a branch point. Expanding the numerator around z = 1,
we have

(9) $R(z) = \frac{e^{-3/4}}{\sqrt{1-z}} + O\left((1-z)^{1/2}\right).$

Therefore (see Theorems VI.1 and VI.3, as well as the discussion in Example VI.2 below,
p. 395), upon translating formally term by term, one obtains

(10) $[z^n] R(z) = \frac{e^{-3/4}}{\sqrt{\pi n}} + O\left(n^{-3/2}\right).$

Furthermore, a full asymptotic expansion into descending powers of n can be derived in the
same way, from a complete expansion of the numerator $e^{-z/2 - z^2/4}$ at z = 1.
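The first-order estimate for 2-regular graphs can be checked against exact coefficients. The following Python sketch is only an illustration (the function name is hypothetical): it expands R(z) = exp(−z/2 − z²/4)/√(1−z) exactly, using the differential equation E′(z) = −((1+z)/2) E(z) for the exponential factor, and compares [z^n]R(z) with e^{−3/4}/√(πn).

```python
from fractions import Fraction
from math import exp, factorial, pi, sqrt


def two_regular_egf(N):
    """Exact coefficients r_0..r_N of R(z) = exp(-z/2 - z^2/4) / sqrt(1 - z)."""
    # e_k: coefficients of exp(-z/2 - z^2/4), from (k+1) e_{k+1} = -(e_k + e_{k-1})/2.
    e = [Fraction(1)]
    for k in range(N):
        prev = e[k - 1] if k >= 1 else 0
        e.append(-(e[k] + prev) / (2 * (k + 1)))
    # b_k: coefficients of (1 - z)^(-1/2), that is binom(2k, k) / 4^k.
    b = [Fraction(1)]
    for k in range(1, N + 1):
        b.append(b[-1] * (2 * k - 1) / (2 * k))
    # Cauchy product of the two series.
    return [sum(e[i] * b[n - i] for i in range(n + 1)) for n in range(N + 1)]


if __name__ == "__main__":
    coeffs = two_regular_egf(100)
    for n in (10, 50, 100):
        estimate = exp(-0.75) / sqrt(pi * n)
        print(n, float(coeffs[n]), estimate, float(coeffs[n]) / estimate)
```

Multiplying r_n by n! recovers the integer counts of labelled 2-regular graphs, which gives an independent check of the series.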
Plan of this chapter. The first part of this chapter, Sections VI.2–VI.5, is dedicated
to the basic technology of singularity analysis along the lines of our foregoing
discussion, and including the case of functions with finitely many singularities on the
boundary of their disc of convergence. An "Intermezzo", Section VI.6, serves as a
prelude to the second part of the chapter, where we investigate operations on generating
functions whose effect on singularities is predictable. The most important of these
is inversion, which, under a broad set of conditions, leads to square-root singularity
and provides a unified asymptotic theory of simple varieties of trees (Section VI.7).
Polylogarithms are proved to be amenable to singularity analysis in Section VI.8, a
fact that permits us to take into account weights such as √n or log n in combinatorial
sums. Composition of functions is studied in Section VI.9. Then Section VI.10
presents several closure properties of functions of singularity analysis class, including
differentiation, integration, and Hadamard product. The chapter concludes with a
brief discussion of two classical alternatives to singularity analysis: Tauberian theory
and Darboux's method (Section VI.11).


VI.2. Coefficient asymptotics for the standard scale 


This section and the next two present the fundamentals of singularity analysis, a 
theory which was developed by Flajolet and Odlyzko in [248]. Technically the theory 
relies on a systematic use of Hankel contours in Cauchy coefficient integrals. Such 
Hankel contours classically serve to express the Gamma function: see Appendix B.3: 
Gamma function, p. 743. Here they are first used to estimate coefficients of a standard 
scale of functions, and then to prove transfer theorems for error terms (Section VI.3).
With this basic process, an asymptotic expansion of a function near a singularity is 
directly mapped to a matching asymptotic expansion of its coefficients. 

Starting from the binomial expansion, we have for general α,

$[z^n](1-z)^{-\alpha} = (-1)^n \binom{-\alpha}{n} = \binom{n+\alpha-1}{n} = \frac{\alpha(\alpha+1)\cdots(\alpha+n-1)}{n!}.$

This quantity is expressible in terms of Gamma factors, and

(11) $\binom{n+\alpha-1}{n} = \frac{\Gamma(n+\alpha)}{\Gamma(\alpha)\, \Gamma(n+1)},$

provided α is neither 0 nor a negative integer. (When α ∈ {0, −1, …}, the coefficients
$\binom{n+\alpha-1}{n}$ eventually vanish, so that the asymptotic problem of estimating [z^n](1 − z)^{−α}
becomes void.) The asymptotic analysis of the coefficients $\binom{n+\alpha-1}{n}$ is straightforward,
by means of Stirling's formula and real integral estimates: see Notes VI.1 and VI.2.


A method far more productive than elementary real analysis techniques consists
in estimating coefficients of a function f(z) by means of Cauchy's coefficient formula:

$[z^n] f(z) = \frac{1}{2i\pi} \int_{\gamma} f(z)\, \frac{dz}{z^{n+1}}.$

The basic principle is simple: it consists in choosing a contour of integration γ that
comes at distance 1/n of the singularity z = 1. Under the change of variables
z = 1 + t/n, the kernel z^{−n−1} in the integral transforms (asymptotically) into an
exponential, and the function can be expanded locally, with the differential coefficient
only introducing a rescaling factor of 1/n:

$z^{-n-1} = \left(1 + \frac{t}{n}\right)^{-n-1} \approx e^{-t}, \qquad dz = \frac{1}{n}\, dt, \qquad (1-z)^{-\alpha} = n^{\alpha} (-t)^{-\alpha}.$

Figure VI.2. The contours C_0, C_1, and C_2 ≡ H(n) used for estimating the
coefficients of functions from the standard function scale.

This gives us for instance (precise justification below):

(12) $[z^n](1-z)^{-\alpha} \sim g_{\alpha}\, n^{\alpha-1}, \qquad g_{\alpha} := \frac{1}{2i\pi} \int_{\mathcal{H}} e^{-t} (-t)^{-\alpha}\, dt.$

The contour and the associated rescaling capture the behaviour of the function near its
singularity, thereby enabling coefficient estimation.


Theorem VI.1 (Standard function scale). Let α be an arbitrary complex number in
$\mathbb{C} \setminus \mathbb{Z}_{\le 0}$. The coefficient of z^n in

$f(z) = (1-z)^{-\alpha}$

admits for large n a complete asymptotic expansion in descending powers of n,

$[z^n] f(z) \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left( 1 + \sum_{k=1}^{\infty} \frac{e_k}{n^k} \right),$

where e_k is a polynomial in α of degree 2k. In particular²:

(13) $[z^n] f(z) \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left( 1 + \frac{\alpha(\alpha-1)}{2n} + \frac{\alpha(\alpha-1)(\alpha-2)(3\alpha-1)}{24 n^2} + \frac{\alpha^2(\alpha-1)^2(\alpha-2)(\alpha-3)}{48 n^3} + O\left(\frac{1}{n^4}\right) \right).$


2 The quantity e_k is a polynomial in α that is divisible by α(α − 1)⋯(α − k), in accordance with the
fact that the asymptotic expansion terminates when α ∈ Z_{≥0}. The factor 1/Γ(α) vanishes identically when
α ∈ Z_{≤0}, in accordance with the fact that coefficients are asymptotically 0 in that case.

Proof. The first step is to express the coefficient [z^n](1 − z)^{−α} as a complex integral
by means of Cauchy's coefficient formula,

(14) $f_n = \frac{1}{2i\pi} \int_{\mathcal{C}} (1-z)^{-\alpha}\, \frac{dz}{z^{n+1}},$

where C is a small enough contour that encircles the origin; see Figure VI.2. We can
start with C ≡ C_0, where C_0 is the positively oriented circle C_0 = {z, |z| = 1/2}. The
second step is to deform C_0 into another simple closed curve C_1 around the origin
that does not cross the half-line ℜ(z) ≥ 1: the contour C_1 consists of a large circle
of radius R > 1 with a notch that comes back near and to the left of z = 1. Since
the integrand along large circles decreases as O(R^{−n−α}), we can finally let R tend to
infinity and are left with an integral representation for f_n where C has been replaced
by a contour C_2 that starts from +∞ in the lower half-plane, winds clockwise around
1, and ends at +∞ in the upper half-plane. The latter is a typical case of a Hankel
contour. A judicious choice of its distance to the half-line R_{≥1} yields the expansion.

To specify precisely the integration path, we particularize C_2 to be the contour
H(n) that passes at a distance 1/n from the half-line R_{≥1}:

(15) $\mathcal{H}(n) = \mathcal{H}^{-}(n) \cup \mathcal{H}^{+}(n) \cup \mathcal{H}^{\circ}(n),$

where

(16) $\mathcal{H}^{-}(n) = \{ z = w - \frac{i}{n},\ w \ge 1 \}$
     $\mathcal{H}^{+}(n) = \{ z = w + \frac{i}{n},\ w \ge 1 \}$
     $\mathcal{H}^{\circ}(n) = \{ z = 1 - \frac{e^{i\phi}}{n},\ \phi \in [-\frac{\pi}{2}, \frac{\pi}{2}] \}.$

Now, a change of variable

(17) $z = 1 + \frac{t}{n}$

in the integral (14) gives the form

(18) $f_n = \frac{n^{\alpha-1}}{2i\pi} \int_{\mathcal{H}} (-t)^{-\alpha} \left(1 + \frac{t}{n}\right)^{-n-1} dt.$

(The Hankel contour H winds about 0, being at distance 1 from the positive real axis;
it is the same as the one in the proof of Theorem B.1, p. 745.)

We have the asymptotic expansion

(19) $\left(1 + \frac{t}{n}\right)^{-n-1} = e^{-(n+1)\log(1 + t/n)} = e^{-t} \left[ 1 + \frac{t^2 - 2t}{2n} + \frac{3t^4 - 20t^3 + 24t^2}{24 n^2} + \cdots \right],$

which tells us that the integrand in (18) converges pointwise (as well as uniformly
in any bounded domain of the t plane) to (−t)^{−α} e^{−t}. Substitution of the asymptotic
form

$\left(1 + \frac{t}{n}\right)^{-n-1} = e^{-t} \left(1 + O\left(\frac{1}{n}\right)\right),$

as n → ∞ inside the integral (18) suggests (formally) that

$[z^n](1-z)^{-\alpha} = \frac{n^{\alpha-1}}{2i\pi} \int_{\mathcal{H}} (-t)^{-\alpha} e^{-t}\, dt \left(1 + O\left(\frac{1}{n}\right)\right) = \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left(1 + O\left(\frac{1}{n}\right)\right),$

when use is made of Hankel's formula for the Gamma function (p. 745).
To justify this formal argument, we proceed as follows: 


(i) Split the contour H according to ℜ(t) ≤ log² n and ℜ(t) ≥ log² n, as in the
corresponding diagram:

(20) [Diagram: the Hankel contour H around 0, cut by the vertical line ℜ(t) = log² n.]

(ii) Verify that the part corresponding to ℜ(t) ≥ log² n is negligible in the scale
of the problem; for instance:

$\left(1 + \frac{t}{n}\right)^{-n} = O(\exp(-\log^2 n))$ for ℜ(t) ≥ log² n.

(iii) Use a terminating form of (19) to develop an expansion to any predetermined
order, with uniform error terms, for the part corresponding to ℜ(t) ≤ log² n.
(This is possible because t/n = O(log² n/n) is small.)


These considerations validate term-by-term integration of expansion (19) within the
integral of (18), so that the full expansion of f_n is determined as follows: a term of
the form t^r/n^k in the expansion (19) induces, by Hankel's formula, a term of the form
n^{−k}/Γ(α − r). (The expansion so obtained is non-degenerate provided α differs from
a negative integer or zero; see also Note VI.3 for details.) Since

$\frac{1}{\Gamma(\alpha - k)} = \frac{1}{\Gamma(\alpha)}\, (\alpha-1)(\alpha-2)\cdots(\alpha-k),$

the expansion in the statement of the theorem eventually follows. ∎
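The quality of the expansion in descending powers of n can be gauged numerically. The Python sketch below is illustrative (function names are hypothetical): it computes [z^n](1 − z)^{−α} from the product α(α+1)⋯(α+n−1)/n! and compares it with the leading term n^{α−1}/Γ(α) corrected by the first three coefficients e_1, e_2, e_3, for real α outside the non-positive integers.

```python
from math import gamma


def exact_coefficient(alpha, n):
    """[z^n](1 - z)^(-alpha) = alpha (alpha+1) ... (alpha+n-1) / n!."""
    c = 1.0
    for j in range(1, n + 1):
        c *= (alpha + j - 1) / j
    return c


def asymptotic_coefficient(alpha, n, terms=3):
    """Leading term n^(alpha-1)/Gamma(alpha) with up to three corrections."""
    e = [
        alpha * (alpha - 1) / 2,
        alpha * (alpha - 1) * (alpha - 2) * (3 * alpha - 1) / 24,
        alpha**2 * (alpha - 1) ** 2 * (alpha - 2) * (alpha - 3) / 48,
    ]
    series = 1.0 + sum(e[k] / n ** (k + 1) for k in range(terms))
    return n ** (alpha - 1) / gamma(alpha) * series


if __name__ == "__main__":
    for alpha in (0.5, -0.5, 1.5):
        for n in (10, 100):
            exact = exact_coefficient(alpha, n)
            print(alpha, n, exact, asymptotic_coefficient(alpha, n) / exact - 1)
```

Setting `terms=0` keeps only the leading term, which makes the gain brought by each successive correction visible.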


The asymptotic approximations obtained from Theorem VI.1 differ from the ones
that are associated with meromorphic asymptotics (Chapter IV), where exponentially
small error terms could be derived. However, it is not uncommon to obtain results with
about 10^{−6} accuracy, already for values of n in the range 10^1–10^2, with just a few terms
of the asymptotic expansion. Figure VI.3 exemplifies this situation by displaying the
approximations obtained for the Catalan numbers,

$C_n = \frac{4^n}{n+1}\, [z^n](1-z)^{-1/2},$

when C_10, C_20, C_50 are considered and up to eight asymptotic terms are taken into
account.


                                          n = 10   n = 20       n = 50
$\frac{4^n}{\sqrt{\pi n^3}}\big(1$        18708    6935533866   2022877684829178931751713264
$-\frac{9}{8} n^{-1}$                     16603    6545410086   1977362936920522405787299715
$+\frac{145}{128} n^{-2}$                 16815    6565051735   1978279553371460627490749710
$-\frac{1155}{1024} n^{-3}$               16794    6564073885   1978261300061101426696482732
$+\frac{36939}{32768} n^{-4}$             16796    6564122750   1978261664919884629357813591
$-\frac{295911}{262144} n^{-5}$           16796    6564120303   1978261657612856326190245636
$+\frac{4735445}{4194304} n^{-6}$         16796    6564120426   1978261657759023715384519184
$-\frac{37844235}{33554432} n^{-7}\big)$  16796    6564120420   1978261657756103402179527600
$C_n$                                     16796    6564120420   1978261657756160653623774456

Figure VI.3. Improved approximations to the Catalan numbers obtained by succes- 
sive terms of their asymptotic expansion (with exact digits in boldface). 
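The n = 10 column of such a table can be reproduced in a few lines. The Python sketch below is illustrative: the names are hypothetical, and the rational coefficients 1, −9/8, 145/128, −1155/1024, 36939/32768 are taken as given for the expansion of C_n in descending powers of n, with prefactor 4^n/√(πn³). It accumulates successive partial sums and rounds them to the nearest integer.

```python
from fractions import Fraction
from math import comb, pi, sqrt

# First coefficients of the expansion of C_n * sqrt(pi n^3) / 4^n in powers of 1/n
# (assumed values, matching the successive rows of the table).
COEFFS = [
    Fraction(1),
    Fraction(-9, 8),
    Fraction(145, 128),
    Fraction(-1155, 1024),
    Fraction(36939, 32768),
]


def catalan_approximations(n):
    """Successive approximations to C_n using 1, 2, ... terms of the expansion."""
    lead = 4.0**n / sqrt(pi * n**3)
    partial, out = 0.0, []
    for k, c in enumerate(COEFFS):
        partial += float(c) / n**k
        out.append(lead * partial)
    return out


def catalan(n):
    return comb(2 * n, n) // (n + 1)


if __name__ == "__main__":
    print([round(x) for x in catalan_approximations(10)], catalan(10))
```

Floating-point arithmetic is sufficient for n = 10; the 28-digit entries of the n = 50 column would require exact or multiprecision arithmetic instead.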


▷ VI.1. Stirling's formula and asymptotics of binomial coefficients. The Gamma function
form (11) of the binomial coefficients yields

$[z^n](1-z)^{-\alpha} = \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left( 1 + \frac{\alpha(\alpha-1)}{2n} + O\left(\frac{1}{n^2}\right) \right),$

when Stirling's formula is applied to the Gamma factors. ◁

▷ VI.2. Beta integrals and asymptotics of binomial coefficients. A direct way of obtaining the
general asymptotic form of $\binom{n+\alpha-1}{n}$ bases itself on the Eulerian Beta integral (see [604, p. 254]
and Appendix B.3: Gamma function, p. 743). Consider the quantity (α > 0)

$\phi(n, \alpha) := \int_0^1 t^{\alpha-1} (1-t)^{n-1}\, dt = \frac{(n-1)!}{\alpha(\alpha+1)\cdots(\alpha+n-1)} \equiv \frac{1}{n \binom{n+\alpha-1}{n}},$

where the second form results elementarily from successive integrations by parts. The change
of variables t = x/n yields

$\phi(n, \alpha) = \frac{1}{n^{\alpha}} \int_0^n x^{\alpha-1} (1 - x/n)^{n-1}\, dx \underset{n \to \infty}{\sim} \frac{1}{n^{\alpha}} \int_0^{\infty} x^{\alpha-1} e^{-x}\, dx \equiv \frac{\Gamma(\alpha)}{n^{\alpha}},$

where the asymptotic form results from the standard limit formula of the exponential:
$\exp(a) = \lim_{n \to \infty} (1 + a/n)^n$.


▷ VI.3. Computability of full expansions. The coefficients e_k of Theorem VI.1 satisfy

$e_k = \sum_{\ell=k}^{2k} (-1)^{\ell}\, \lambda_{k,\ell}\, (\alpha-1)(\alpha-2)\cdots(\alpha-\ell),$

where $\lambda_{k,\ell} := [v^k t^{\ell}]\, e^{t} (1 + vt)^{-1-1/v}$. ◁

▷ VI.4. Oscillations and complex exponents. Oscillations occur in the case of singular
expansions involving complex exponents. From the consideration of
$[z^n](1-z)^{\pm i} \asymp n^{\mp i - 1}$ one finds

$[z^n] \cos\left( \log \frac{1}{1-z} \right) = \frac{P(\log n)}{n} + O\left(\frac{1}{n^2}\right),$

where P(u) is a continuous and 1-periodic function. In general, such oscillations are present in
[z^n](1 − z)^{−α} for any non-real α. ◁


Logarithmic factors. The basic principle underlying the method of proof of The- 
orem VI.1 (see also the summary Equation (12)) has the advantage of being easily 
extended to a wide class of singular functions, most notably the ones that involve 
logarithmic terms. 


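Theorem VI.2 (Standard function scale, logarithms). Let $\alpha$ be an arbitrary complex
number in $\mathbb{C} \setminus \mathbb{Z}_{\le 0}$. The coefficient of $z^n$ in the function³

$$f(z) = (1-z)^{-\alpha}\left(\frac{1}{z}\log\frac{1}{1-z}\right)^{\beta}$$

admits for large n a full asymptotic expansion in descending powers of log n,

$$f_n = [z^n]f(z) \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}(\log n)^{\beta}
  \left[1 + \frac{C_1}{\log n} + \frac{C_2}{\log^2 n} + \cdots\right], \tag{21}$$

where $C_k = \binom{\beta}{k}\,\Gamma(\alpha)\,\left.\dfrac{d^k}{ds^k}\dfrac{1}{\Gamma(s)}\right|_{s=\alpha}$.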
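Proof. The proof is a simple variant of that of Theorem VI.1 (see [248] for details).
The basic expansion used is now

$$\begin{aligned}
f\!\left(1+\frac{t}{n}\right)\left(1+\frac{t}{n}\right)^{-n-1}
 &\sim e^{-t}(-t)^{-\alpha}n^{\alpha}\left(\log\frac{-n}{t}\right)^{\beta} \\
 &\sim e^{-t}(-t)^{-\alpha}n^{\alpha}(\log n)^{\beta}\left(1-\frac{\log(-t)}{\log n}\right)^{\beta} \\
 &\sim e^{-t}(-t)^{-\alpha}n^{\alpha}(\log n)^{\beta}
    \left(1-\beta\,\frac{\log(-t)}{\log n}
    +\frac{\beta(\beta-1)}{2!}\left(\frac{\log(-t)}{\log n}\right)^{2}+\cdots\right).
\end{aligned}$$

Again, we are justified in using this expansion inside Cauchy's integral representation
of coefficients. What comes out from term-by-term integration is a collection of
Hankel integrals of the form

$$-\frac{1}{2i\pi}\int_{+\infty}^{(0)}(-t)^{-s}e^{-t}\bigl(\log(-t)\bigr)^{k}\,dt
  = (-1)^k\frac{d^k}{ds^k}\left[-\frac{1}{2i\pi}\int_{+\infty}^{(0)}(-t)^{-s}e^{-t}\,dt\right]
  = (-1)^k\frac{d^k}{ds^k}\frac{1}{\Gamma(s)},$$

where the reduction to derivatives of $1/\Gamma(s)$ results from differentiation with respect
to s under the integral sign. ∎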


A typical example of application of Theorem VI.2 is the estimate

$$[z^n]\,\frac{1}{\sqrt{1-z}}\;\frac{1}{\frac{1}{z}\log\frac{1}{1-z}}
  = \frac{1}{\sqrt{\pi n}\,\log n}\left(1 - \frac{\gamma + 2\log 2}{\log n}
    + O\!\left(\frac{1}{\log^2 n}\right)\right).$$


³ A coefficient of 1/z is introduced in front of the logarithm since
$\log(1-z)^{-1} = z + O(z^2)$: in this way, f(z) is a bona fide power series in z, even when
$\beta$ is not an integer. Such a factor does not affect asymptotic expansions in a
logarithmic scale near z = 1.
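                                    $\alpha \notin \{0,-1,-2,\ldots\}$                                                              (Eq.)   $\alpha \in \{0,-1,-2,\ldots\}$                                                (Eq.)
$\beta \notin \mathbb{Z}_{\ge 0}$   $f_n \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}(\log n)^{\beta}\sum_{j\ge 0}\frac{C_j}{(\log n)^j}$   (21)    $f_n \sim n^{\alpha-1}(\log n)^{\beta}\sum_{j\ge 1}\frac{D_j}{(\log n)^j}$   (24)
$\beta \in \mathbb{Z}_{\ge 0}$      $f_n \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}\sum_{j\ge 0}\frac{E_j(\log n)}{n^j}$                  (25)    $f_n \sim n^{\alpha-1}\sum_{j\ge 0}\frac{F_j(\log n)}{n^j}$                  (27)


Figure VI.4. The general and special cases of $f_n = [z^n]f(z)$ when f(z) is as in
Theorem VI.2.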


(Such singular functions do occur in combinatorics and analysis of algorithms [257].) 
▷ VI.5. Singularity analysis of slowly varying functions. A function $\Lambda(u)$ is said to
be slowly varying towards infinity (in the complex plane) if there exists a
$\phi \in (0, \frac{\pi}{2})$ such that, for any fixed c > 0 and all $\theta$ satisfying
$|\theta| \le \pi - \phi$, there holds

$$\lim_{u\to+\infty}\frac{\Lambda(c\,e^{i\theta}u)}{\Lambda(u)} = 1. \tag{22}$$

(Powers of logarithms and iterated logarithms are typically slowly varying functions.) Under
uniformity assumptions on (22), the following estimate holds [248]:

$$[z^n]\,(1-z)^{-\alpha}\,\Lambda\!\left(\frac{1}{1-z}\right) \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}\,\Lambda(n). \tag{23}$$

For instance, we have:

$$[z^n]\,\frac{\exp\left(\sqrt{\frac{1}{z}\log\frac{1}{1-z}}\right)}{\sqrt{1-z}}
  \sim \frac{\exp\left(\sqrt{\log n}\right)}{\sqrt{\pi n}}.$$

See also the discussion of Tauberian theory, p. 435. ◁


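▷ VI.6. Iterated logarithms. For a general $\alpha \notin \mathbb{Z}_{\le 0}$, the relation (23)
admits as a special case

$$[z^n]\,(1-z)^{-\alpha}\left(\frac{1}{z}\log\frac{1}{1-z}\right)^{\beta}
  \left(\frac{1}{z}\log\left(\frac{1}{z}\log\frac{1}{1-z}\right)\right)^{\delta}
  \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}(\log n)^{\beta}(\log\log n)^{\delta}.$$

A full asymptotic expansion can be derived in this case. ◁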


Special cases. The conditions of Theorems VI.1 and VI.2 exclude explicitly the
case when $\alpha$ is a negative integer: the formulae actually remain valid in this case,
provided one interprets them as limit cases, making use of
$1/\Gamma(0) = 1/\Gamma(-1) = \cdots = 0$. Also, when $\beta$ is a positive integer, the
expansion of Theorem VI.2 terminates: in that situation, stronger forms are valid. Such
cases are summarized in Figure VI.4 and discussed below.


The case of integral $\alpha \in \mathbb{Z}_{\le 0}$ and general $\beta \notin \mathbb{Z}_{\ge 0}$.
When $\alpha$ is a negative integer, the coefficients of $f(z) = (1-z)^{-\alpha}$ eventually
reduce to zero, so that the asymptotic coefficient expansion becomes trivial: this situation
is implicitly covered by the statement of Theorem VI.1 since, in that case,
$1/\Gamma(\alpha) = 0$. When logarithms are present (with $\alpha \in \mathbb{Z}_{\le 0}$ still),
the expansion of Theorem VI.2 regarding

$$[z^n]\,(1-z)^{-\alpha}\left(\frac{1}{z}\log\frac{1}{1-z}\right)^{\beta}$$

remains valid provided we again take into account the equality $1/\Gamma(\alpha) = 0$ in
formula (21) after effecting simplifications by Gamma factors: it is only the first term
of (21) that vanishes, and one has

$$[z^n]f(z) \sim n^{\alpha-1}(\log n)^{\beta}\left[\frac{D_1}{\log n} + \frac{D_2}{\log^2 n} + \cdots\right], \tag{24}$$

where $D_k$ is given by
$D_k = \binom{\beta}{k}\left.\dfrac{d^k}{ds^k}\dfrac{1}{\Gamma(s)}\right|_{s=\alpha}$.
For instance, we find

$$[z^n]\,\frac{z}{\log(1-z)^{-1}} = -\frac{1}{n\log^2 n} + \frac{2\gamma}{n\log^3 n}
  + O\!\left(\frac{1}{n\log^4 n}\right).$$


The case of general $\alpha \notin \mathbb{Z}_{\le 0}$ and integral $\beta \in \mathbb{Z}_{\ge 0}$.
When $\beta = k$ is a non-negative integer, the error terms can be further improved with
respect to the ones predicted by the general statement of Theorem VI.2. For instance, we have:

$$[z^n]\,\frac{1}{1-z}\log\frac{1}{1-z} = \log n + \gamma + \frac{1}{2n} - \frac{1}{12n^2}
  + O\!\left(\frac{1}{n^4}\right),$$

$$[z^n]\,\frac{1}{\sqrt{1-z}}\log\frac{1}{1-z} \sim \frac{1}{\sqrt{\pi n}}
  \left(\log n + \gamma + 2\log 2 + O\!\left(\frac{\log n}{n}\right)\right).$$

(In such a case, the expansion of Theorem VI.2 terminates since only its first (k + 1)
terms are non-zero.) In fact, in the general case of non-integral $\alpha$, there exists an
expansion of the form

$$[z^n]\,(1-z)^{-\alpha}\log^k\frac{1}{1-z} \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}
  \left[E_0(\log n) + \frac{E_1(\log n)}{n} + \cdots\right], \tag{25}$$

where the $E_j$ are polynomials of degree k, as can be proved by adapting the argument
employed for general $\alpha$ (Note VI.8).


The joint case of integral $\alpha \in \mathbb{Z}_{\le 0}$ and integral
$\beta \in \mathbb{Z}_{\ge 0}$. If $\alpha$ is a negative integer, the coefficients appear as
finite differences of coefficients of logarithmic powers. Explicit formulae are then
available elementarily from the calculus of finite differences when $\beta$ is a positive
integer. For instance, with $\alpha = -m$ for $m \in \mathbb{Z}_{\ge 0}$, one has, for n > m,

$$[z^n]\,(1-z)^{m}\log\frac{1}{1-z} = (-1)^m\,\frac{m!}{n(n-1)\cdots(n-m)}. \tag{26}$$
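
Formula (26) is an exact identity and can be checked with rational arithmetic. The sketch below is illustrative only (the function names are not from the text): it multiplies the series of log 1/(1 - z), whose coefficients are 1/k, by the polynomial (1 - z)^m and compares the result with the closed form, assuming n > m.

```python
from fractions import Fraction
from math import comb, factorial


def coeff_one_minus_z_pow_m_log(n, m):
    """Exact [z^n] (1 - z)^m log(1/(1 - z)), for n > m >= 0."""
    # log(1/(1-z)) = sum_{k>=1} z^k / k and (1-z)^m = sum_j (-1)^j binom(m, j) z^j,
    # so the coefficient is a finite difference of the sequence 1/k.
    return sum(Fraction((-1) ** j * comb(m, j), n - j) for j in range(m + 1))


def closed_form(n, m):
    """Right-hand side of (26): (-1)^m m! / (n (n-1) ... (n-m))."""
    denom = 1
    for i in range(m + 1):
        denom *= n - i
    return Fraction((-1) ** m * factorial(m), denom)


if __name__ == "__main__":
    for m in range(4):
        n = m + 5
        print(m, n, coeff_one_minus_z_pow_m_log(n, m), closed_form(n, m))
```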


The case $\alpha = -m$ and $\beta = k$ (with $m, k \in \mathbb{Z}_{\ge 0}$) is covered by (28)
in Note VI.7 below: there is a formula analogous to (25),

$$[z^n]\,(1-z)^{m}\log^k\frac{1}{1-z} \sim n^{-m-1}
  \left[F_0(\log n) + \frac{F_1(\log n)}{n} + \cdots\right], \tag{27}$$

but now with $\deg(F_j) = k - 1$.
Figure VI.5 provides the asymptotic form of coefficients of a few standard func- 
tions illustrating Theorems VI.1 and VI.2 as well as some of the “special cases”. 
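Function                 Coefficients

$(1-z)^{3/2}$            $\frac{1}{\sqrt{\pi n^5}}\left(\frac{3}{4} + \frac{45}{32n} + \frac{1155}{512n^2} + O(\frac{1}{n^3})\right)$
$(1-z)$                  (0)
$(1-z)^{1/2}$            $-\frac{1}{\sqrt{\pi n^3}}\left(\frac{1}{2} + \frac{3}{16n} + \frac{25}{256n^2} + O(\frac{1}{n^3})\right)$
$(1-z)^{1/2}L(z)$        $-\frac{1}{\sqrt{\pi n^3}}\left(\frac{1}{2}\log n + \frac{\gamma + 2\log 2 - 2}{2} + O(\frac{\log n}{n})\right)$
$(1-z)^{1/3}$            $-\frac{1}{3\Gamma(\frac{2}{3})\,n^{4/3}}\left(1 + \frac{2}{9n} + \frac{7}{81n^2} + O(\frac{1}{n^3})\right)$
$z/L(z)$                 $\frac{1}{n\log^2 n}\left(-1 + \frac{2\gamma}{\log n} + \frac{\pi^2 - 6\gamma^2}{2\log^2 n} + O(\frac{1}{\log^3 n})\right)$
$1$                      (0)
$\log(1-z)$              $-\frac{1}{n}$
$\log^2(1-z)$            $\frac{1}{n}\left(2\log n + 2\gamma - \frac{1}{n} - \frac{1}{6n^2} + O(\frac{1}{n^4})\right)$
$(1-z)^{-1/3}$           $\frac{1}{\Gamma(\frac{1}{3})\,n^{2/3}}\left(1 + O(\frac{1}{n})\right)$
$(1-z)^{-1/2}$           $\frac{1}{\sqrt{\pi n}}\left(1 - \frac{1}{8n} + \frac{1}{128n^2} + \frac{5}{1024n^3} + O(\frac{1}{n^4})\right)$
$(1-z)^{-1/2}L(z)$       $\frac{1}{\sqrt{\pi n}}\left(\log n + \gamma + 2\log 2 - \frac{\log n + \gamma + 2\log 2}{8n} + O(\frac{\log n}{n^2})\right)$
$(1-z)^{-1}$             $1$
$(1-z)^{-1}L(z)$         $\log n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4} + O(\frac{1}{n^6})$
$(1-z)^{-1}L(z)^2$       $\log^2 n + 2\gamma\log n + \gamma^2 - \frac{\pi^2}{6} + O(\frac{\log n}{n})$
$(1-z)^{-3/2}$           $\sqrt{\frac{n}{\pi}}\left(2 + \frac{3}{4n} - \frac{7}{64n^2} + O(\frac{1}{n^3})\right)$
$(1-z)^{-3/2}L(z)$       $\sqrt{\frac{n}{\pi}}\left(2\log n + 2\gamma + 4\log 2 - 4 + \frac{3\log n}{4n} + O(\frac{1}{n})\right)$
$(1-z)^{-2}$             $n + 1$
$(1-z)^{-2}L(z)$         $n\log n + (\gamma - 1)n + \log n + \frac{1}{2} + \gamma + O(\frac{1}{n})$
$(1-z)^{-2}L(z)^2$       $n\left(\log^2 n + 2(\gamma - 1)\log n + \gamma^2 - 2\gamma + 2 - \frac{\pi^2}{6} + O(\frac{\log^2 n}{n})\right)$
$(1-z)^{-3}$             $\frac{1}{2}n^2 + \frac{3}{2}n + 1$


Figure VI.5. A table of some commonly encountered functions and the asymptotic
forms of their coefficients. The following abbreviation is used:

$$L(z) := \log\frac{1}{1-z}.$$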
▷ VI.7. The method of Frobenius and Jungen. This is an alternative approach to the case
$\beta \in \mathbb{Z}_{\ge 0}$ (see [360]). Start from the observation that

$$(1-z)^{-\alpha}\left(\log\frac{1}{1-z}\right)^{k} = \frac{\partial^k}{\partial\alpha^k}(1-z)^{-\alpha},$$

then let the operators of differentiation ($\partial/\partial\alpha$) and coefficient
extraction ($[z^n]$) commute (this can be justified by Cauchy's coefficient formula upon
differentiating under the integral sign). This yields

$$[z^n]\,(1-z)^{-\alpha}\left(\log\frac{1}{1-z}\right)^{k}
  = \frac{\partial^k}{\partial\alpha^k}\,\frac{\Gamma(n+\alpha)}{\Gamma(\alpha)\Gamma(n+1)}, \tag{28}$$

which leads to an "exact" formula (Note VI.8 below). ◁


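▷ VI.8. Shifted harmonic numbers. Define the $\alpha$-shifted harmonic number by

$$h_n(\alpha) := \sum_{j=0}^{n-1}\frac{1}{j+\alpha}.$$

With $L(z) := -\log(1-z)$, still, one has

$$[z^n]\,(1-z)^{-\alpha}L(z) = \binom{n+\alpha-1}{n}\,h_n(\alpha),$$

$$[z^n]\,(1-z)^{-\alpha}L(z)^2 = \binom{n+\alpha-1}{n}\left(h_n'(\alpha) + h_n(\alpha)^2\right).$$

(Note: $h_n(\alpha) = \psi(\alpha+n) - \psi(\alpha)$, where $\psi(s) := \partial_s\log\Gamma(s)$.)
In particular,

$$[z^n]\,\frac{1}{\sqrt{1-z}}\log\frac{1}{1-z} = \frac{1}{4^n}\binom{2n}{n}\left[2H_{2n} - H_n\right],$$

where $H_n \equiv h_n(1)$ is the usual harmonic number. ◁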
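
The last identity of Note VI.8 can be confirmed by brute force. The sketch below is an illustration, not part of the text: it forms the Cauchy product of the exact series of (1 - z)^(-1/2) and L(z) with rational arithmetic and compares it with the closed form; all function names are chosen here for convenience.

```python
from fractions import Fraction
from math import comb


def series_inv_sqrt(n_max):
    """Coefficients of (1 - z)^(-1/2): binom(2k, k) / 4^k."""
    return [Fraction(comb(2 * k, k), 4 ** k) for k in range(n_max + 1)]


def series_log(n_max):
    """Coefficients of L(z) = log(1/(1 - z)): 0, 1, 1/2, 1/3, ..."""
    return [Fraction(0)] + [Fraction(1, k) for k in range(1, n_max + 1)]


def cauchy_product(a, b):
    """Coefficients of the product of two truncated power series."""
    n_max = min(len(a), len(b)) - 1
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_max + 1)]


def harmonic(n):
    """Usual harmonic number H_n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


def shifted_harmonic(n, alpha):
    """h_n(alpha) = sum_{j=0}^{n-1} 1 / (j + alpha), alpha given as a Fraction."""
    return sum(1 / (j + alpha) for j in range(n))


if __name__ == "__main__":
    lhs = cauchy_product(series_inv_sqrt(8), series_log(8))
    for n in range(1, 9):
        rhs = Fraction(comb(2 * n, n), 4 ** n) * (2 * harmonic(2 * n) - harmonic(n))
        print(n, lhs[n], lhs[n] == rhs)
```

The check also illustrates the relation h_n(1/2) = 2 H_{2n} - H_n that underlies the formula.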


VI.3. Transfers 


Our general objective is to translate an approximation of a function near a sin- 
gularity into an asymptotic approximation of its coefficients. What is required at this 
stage is a way to extract coefficients of error terms (known usually in O(·) or o(·)
form) in the expansion of a function near a singularity. This task is technically simple 
as a fairly coarse analysis suffices. As in the previous section, it relies on contour inte- 
gration by means of Hankel-type paths; see for instance the summary in Equation (12), 
p. 381, above. 

A natural extension of the approach of the previous section is to assume the error
terms to be valid in the complex plane slit along the real half line $\mathbb{R}_{\ge 1}$.
In fact, weaker conditions suffice: any domain whose boundary makes an acute angle with the
half line $\mathbb{R}_{\ge 1}$ appears to be suitable.


Definition VI.1. Given two numbers $\phi$, R with R > 1 and $0 < \phi < \frac{\pi}{2}$, the open
domain $\Delta(\phi, R)$ is defined as

$$\Delta(\phi, R) = \{\, z \;\mid\; |z| < R,\; z \ne 1,\; |\arg(z-1)| > \phi \,\}.$$

A domain is a Δ-domain at 1 if it is a $\Delta(\phi, R)$ for some R and $\phi$. For a complex
number $\zeta \ne 0$, a Δ-domain at $\zeta$ is the image by the mapping $z \mapsto \zeta z$ of a
Δ-domain at 1. A function is Δ-analytic if it is analytic in some Δ-domain.
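
As a concrete reading of Definition VI.1, membership in a Δ-domain at 1 is a simple test on the modulus of z and on the argument of z - 1. The helper below is a hypothetical illustration (its name and signature are not from the text); it uses the principal argument in (-π, π].

```python
import cmath
import math


def in_delta_domain(z, phi, R):
    """True if z lies in Delta(phi, R) = {|z| < R, z != 1, |arg(z - 1)| > phi}."""
    if not (R > 1 and 0 < phi < math.pi / 2):
        raise ValueError("need R > 1 and 0 < phi < pi/2")
    z = complex(z)
    if z == 1:
        return False
    return abs(z) < R and abs(cmath.phase(z - 1)) > phi


if __name__ == "__main__":
    phi, R = math.pi / 4, 1.5
    for z in (0, 0.5, 1, 1.2, 1 + 0.5j, 2):
        print(z, in_delta_domain(z, phi, R))
```

Points of the real axis beyond 1 are excluded (their argument relative to 1 is zero), while points slightly outside the unit disc but away from that half line are included; this is precisely the extra room that the transfer theorems exploit.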
Figure VI.6. A Δ-domain and the contour used to establish Theorem VI.3.


Analyticity in a Δ-domain (Figure VI.6, left) is the basic condition for transfer
to coefficients of error terms in asymptotic expansions.
Theorem VI.3 (Transfer, Big-Oh and little-oh). Let $\alpha, \beta$ be arbitrary real numbers,
$\alpha, \beta \in \mathbb{R}$, and let f(z) be a function that is Δ-analytic.

(i) Assume that f(z) satisfies in the intersection of a neighbourhood of 1 with its
Δ-domain the condition

$$f(z) = O\!\left((1-z)^{-\alpha}\left(\log\frac{1}{1-z}\right)^{\beta}\right).$$

Then one has: $[z^n]f(z) = O\!\left(n^{\alpha-1}(\log n)^{\beta}\right)$.

(ii) Assume that f(z) satisfies in the intersection of a neighbourhood of 1 with
its Δ-domain the condition

$$f(z) = o\!\left((1-z)^{-\alpha}\left(\log\frac{1}{1-z}\right)^{\beta}\right).$$

Then one has: $[z^n]f(z) = o\!\left(n^{\alpha-1}(\log n)^{\beta}\right)$.
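Proof. (i) The starting point is Cauchy's coefficient formula,

$$f_n \equiv [z^n]f(z) = \frac{1}{2i\pi}\int_{\gamma} f(z)\,\frac{dz}{z^{n+1}},$$

where $\gamma$ is any simple loop around the origin which is internal to the Δ-domain of f.
We choose the positively oriented contour (Figure VI.6, right)
$\gamma = \gamma_1 \cup \gamma_2 \cup \gamma_3 \cup \gamma_4$, with

$$\begin{aligned}
\gamma_1 &= \{\, z \mid |z-1| = \tfrac{1}{n},\ |\arg(z-1)| \ge \theta \,\} && \text{(inner circle)} \\
\gamma_2 &= \{\, z \mid \tfrac{1}{n} \le |z-1|,\ |z| \le r,\ \arg(z-1) = \theta \,\} && \text{(top line segment)} \\
\gamma_3 &= \{\, z \mid |z| = r,\ |\arg(z-1)| \ge \theta \,\} && \text{(outer circle)} \\
\gamma_4 &= \{\, z \mid \tfrac{1}{n} \le |z-1|,\ |z| \le r,\ \arg(z-1) = -\theta \,\} && \text{(bottom line segment).}
\end{aligned}$$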
If the Δ-domain of f is $\Delta(\phi, R)$, we assume that 1 < r < R, and
$\phi < \theta < \frac{\pi}{2}$, so that the contour $\gamma$ lies entirely inside the domain of
analyticity of f. For j = 1, 2, 3, 4, let

$$f_n^{(j)} = \frac{1}{2i\pi}\int_{\gamma_j} f(z)\,\frac{dz}{z^{n+1}}.$$

The analysis proceeds by bounding the absolute value of the integral along each of
the four parts. In order to keep notations simple, we detail the proof in the case where
$\beta = 0$.


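(1) Inner circle ($\gamma_1$). From trivial bounds, the contribution from $\gamma_1$ satisfies

$$|f_n^{(1)}| = O\!\left(\frac{1}{n}\right)\cdot O\!\left(\left(\frac{1}{n}\right)^{-\alpha}\right) = O\!\left(n^{\alpha-1}\right),$$

    as the function is $O(n^{\alpha})$ (by assumption on f(z)), the contour has length
    $O(n^{-1})$, and $z^{-n-1}$ remains O(1) on this part of the contour.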
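(2) Rectilinear parts ($\gamma_2$, $\gamma_4$). Consider the contribution $f_n^{(2)}$ arising
    from the part $\gamma_2$ of the contour. Setting $\omega = e^{i\theta}$, and performing the
    change of variable $z = 1 + \frac{\omega t}{n}$ (so that $|dz| = dt/n$), we find

$$|f_n^{(2)}| \le \frac{1}{2\pi}\int_1^{\infty} K\left(\frac{t}{n}\right)^{-\alpha}
   \left|1 + \frac{\omega t}{n}\right|^{-n-1}\frac{dt}{n},$$

    for some constant K > 0 such that $|f(z)| < K\,|1-z|^{-\alpha}$ over the Δ-domain,
    which is granted by the growth assumption on f. From the relation

$$\left|1 + \frac{\omega t}{n}\right| \ge 1 + \Re\!\left(\frac{\omega t}{n}\right) = 1 + \frac{t}{n}\cos\theta,$$

    there results the inequality

$$|f_n^{(2)}| \le \frac{K}{2\pi}\,J_n\,n^{\alpha-1}, \qquad\text{where}\quad
   J_n = \int_1^{\infty} t^{-\alpha}\left(1 + \frac{t\cos\theta}{n}\right)^{-n} dt.$$

    For a given $\alpha$, the integrals $J_n$ are all bounded above by some constant since
    they admit a limit as n tends to infinity:

$$J_n \to \int_1^{\infty} t^{-\alpha}e^{-t\cos\theta}\,dt.$$

    The condition on $\theta$ that $0 < \theta < \pi/2$ precisely ensures convergence of the
    integral. Thus, globally, on the part $\gamma_2$ of the contour, we have

$$|f_n^{(2)}| = O\!\left(n^{\alpha-1}\right).$$

    A similar bound holds for $f_n^{(4)}$ relative to $\gamma_4$.

(3) Outer circle ($\gamma_3$). There, f(z) is bounded while $z^{-n}$ is of the order of
    $r^{-n}$. Thus, the integral $f_n^{(3)}$ is exponentially small.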
In summary, each of the four integrals of the split contour contributes $O(n^{\alpha-1})$. The
statement of part (i) of the theorem thus follows, when $\beta = 0$. Entirely similar
bounding techniques cover the case of logarithmic factors ($\beta \ne 0$).


(ii) An adaptation of the proof shows that o(·) error terms may be translated
similarly. All that is required is a further break-up of the rectilinear part at a distance
$\log^2 n / n$ from 1 (see the discussion surrounding Equation (20), p. 383 or [248] for
details). ∎


An immediate corollary of Theorem VI.3 is the possibility of transferring asymp- 
totic equivalence from singular forms to coefficients: 


Corollary VI.1 (sim-transfer). Assume that f(z) is Δ-analytic and

$$f(z) \sim (1-z)^{-\alpha}, \qquad\text{as } z \to 1,\ z \in \Delta,$$

with $\alpha \notin \{0, -1, -2, \ldots\}$. Then, the coefficients of f satisfy

$$[z^n]f(z) \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}.$$


Proof. It suffices to observe that, with $g(z) = (1-z)^{-\alpha}$, one has

$$f(z) \sim g(z) \quad\text{iff}\quad f(z) = g(z) + o(g(z)),$$

then apply Theorem VI.1 to the first term, and Theorem VI.3 (little-oh transfer) to the
remainder. ∎


▷ VI.9. Transfer of nearly polynomial functions. Let f(z) be Δ-analytic and satisfy the
singular expansion $f(z) \sim (1-z)^{r}$, where $r \in \mathbb{Z}_{\ge 0}$. Then,
$f_n = o(n^{-r-1})$. [This is a direct consequence of the little-oh transfer.] ◁

▷ VI.10. Transfer of large negative exponents. The Δ-analyticity condition can be weakened
for functions that are large at their singularity. Assume that f(z) is analytic in the open
disc |z| < 1, and that in the whole of the open disc it satisfies

$$f(z) = O\!\left((1-z)^{-\alpha}\right).$$

Then, provided $\alpha > 1$, one has

$$[z^n]f(z) = O\!\left(n^{\alpha-1}\right).$$

[Hint. Integrate on the circle of radius $1 - \frac{1}{n}$; see also [248].] ◁


VI.4. The process of singularity analysis


In Sections VI.2 and VI.3, we have developed a collection of statements granting
us the existence of correspondences between properties of a function f(z) singular
at an isolated point (z = 1) and the asymptotic behaviour of its coefficients
$f_n = [z^n]f(z)$. Using the symbol '⟶' to represent such a correspondence⁴, we
can summarize some of our results relative to the scale
$\{(1-z)^{-\alpha},\ \alpha \in \mathbb{C} \setminus \mathbb{Z}_{\le 0}\}$ as follows:

$$\begin{aligned}
f(z) = (1-z)^{-\alpha} \quad&\longrightarrow\quad f_n = \frac{n^{\alpha-1}}{\Gamma(\alpha)} + \cdots && \text{(Theorem VI.1)} \\
f(z) = O\!\left((1-z)^{-\alpha}\right) \quad&\longrightarrow\quad f_n = O\!\left(n^{\alpha-1}\right) && \text{(Theorem VI.3 (i))} \\
f(z) = o\!\left((1-z)^{-\alpha}\right) \quad&\longrightarrow\quad f_n = o\!\left(n^{\alpha-1}\right) && \text{(Theorem VI.3 (ii))} \\
f(z) \sim (1-z)^{-\alpha} \quad&\longrightarrow\quad f_n \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)} && \text{(Corollary VI.1).}
\end{aligned}$$

The important requirement is that the function should have an isolated singularity (the
condition of Δ-analyticity) and that the asymptotic property of the function near its
singularity should be valid in an area of the complex plane extending beyond the disc
of convergence of the original series (in a Δ-domain). Extensions to logarithmic
powers and special cases like $\alpha \in \mathbb{Z}_{\le 0}$ are also, as we know, available.
We let S denote the set of such singular functions:

$$S = \left\{(1-z)^{-\alpha}\lambda(z)^{\beta} \;\middle|\; \alpha, \beta \in \mathbb{C}\right\},
  \qquad \lambda(z) := \frac{1}{z}\log\frac{1}{1-z} \equiv \frac{1}{z}L(z). \tag{29}$$


⁴ The symbol "⟹" represents an unconditional logical implication and is accordingly used in
this book to represent the systematic correspondence between combinatorial specifications and
generating function equations. In contrast, the symbol '⟶' represents a mapping from functions
to coefficients, under suitable analytic conditions, like those of Theorems VI.1–VI.3.

At this stage, we thus have available tools by which, starting from the expansion 
of a function at its singularity, also called singular expansion, one can justify the term- 
by-term transfer from an approximation of the function to an asymptotic estimate of 
the coefficients⁵. We state the following theorem.

Theorem VI.4 (Singularity analysis, single singularity). Let f(z) be a function analytic
at 0 with a singularity at $\zeta$, such that f(z) can be continued to a domain of the form
$\zeta \cdot \Delta_0$, for a Δ-domain $\Delta_0$, where $\zeta \cdot \Delta_0$ is the image of
$\Delta_0$ by the mapping $z \mapsto \zeta z$. Assume that there exist two functions
$\sigma, \tau$, where $\sigma$ is a (finite) linear combination of functions in S and
$\tau \in S$, so that

$$f(z) = \sigma(z/\zeta) + O\!\left(\tau(z/\zeta)\right) \qquad\text{as } z \to \zeta \text{ in } \zeta\cdot\Delta_0.$$

Then, the coefficients of f(z) satisfy the asymptotic estimate

$$f_n = \zeta^{-n}\sigma_n + O\!\left(\zeta^{-n}\tau_n^{\star}\right),$$

where $\sigma_n = [z^n]\sigma(z)$ has its coefficients determined by Theorems VI.1, VI.2 and
$\tau_n^{\star} = n^{a-1}(\log n)^{b}$, if $\tau(z) = (1-z)^{-a}\lambda(z)^{b}$.

We observe that the statement is equivalent to $\tau_n^{\star} = [z^n]\tau(z)$, except when
$a \in \mathbb{Z}_{\le 0}$, where the $1/\Gamma(a)$ factor should be omitted. Also, generically,
we have $\tau_n^{\star} = o(\sigma_n)$, so that orders of growth of functions at singularities
are mapped to orders of growth of coefficients.

Proof. The normalized function $g(z) = f(\zeta z)$ is singular at 1. It is Δ-analytic and
satisfies the relation $g(z) = \sigma(z) + O(\tau(z))$ as $z \to 1$ within $\Delta_0$.
Theorem VI.3, (i) (the big-Oh transfer) applies to the O-error term. The statement follows
finally since $[z^n]f(z) = \zeta^{-n}[z^n]g(z)$. ∎


⁵ Functions with a singularity of type $(1-z)^{-\alpha}$, possibly with logarithmic factors,
are sometimes called algebraic–logarithmic.
Let f(z) be a function analytic at 0 whose coefficients are to be asymptotically analysed.
1. Preparation. This consists in locating dominant singularities and checking analytic
   continuation.
   1a. Locate singularities. Determine the dominant singularities of f(z) (assumed
       not to be entire). Check that f(z) has a single singularity $\zeta$ on its circle of
       convergence.
   1b. Check continuation. Establish that f(z) is analytic in some domain of the
       form $\zeta\Delta_0$.
2. Singular expansion. Analyse the function f(z) as $z \to \zeta$ in the domain
   $\zeta \cdot \Delta_0$ and determine in that domain an expansion of the form

$$f(z) \underset{z\to\zeta}{=} \sigma(z/\zeta) + O\!\left(\tau(z/\zeta)\right)
   \qquad\text{with}\quad \tau(z) = o(\sigma(z)).$$

   For the method to succeed, the functions $\sigma$ and $\tau$ should belong to the standard
   scale of functions $S = \{(1-z)^{-\alpha}\lambda(z)^{\beta}\}$, with
   $\lambda(z) := z^{-1}\log(1-z)^{-1}$.
3. Transfer. Translate the main term $\sigma(z)$ using the catalogues provided by
   Theorems VI.1 and VI.2. Transfer the error term (Theorem VI.3) and conclude that

$$[z^n]f(z) \underset{n\to+\infty}{=} \zeta^{-n}\sigma_n + O\!\left(\zeta^{-n}\tau_n^{\star}\right),$$

   where $\sigma_n = [z^n]\sigma(z)$ and $\tau_n^{\star} = [z^n]\tau(z)$ provided the corresponding
   exponent $\alpha \notin \mathbb{Z}_{\le 0}$ (otherwise, the factor $1/\Gamma(\alpha) = 0$ should
   be dropped).


Figure VI.7. A summary of the singularity analysis process (single dominant singularity).


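The statement of Theorem VI.4 can be concisely expressed by the correspondence:

$$f(z) \underset{z\to\zeta}{=} \sigma(z/\zeta) + O\!\left(\tau(z/\zeta)\right)
  \quad\longrightarrow\quad
  f_n \underset{n\to\infty}{=} \zeta^{-n}\sigma_n + O\!\left(\zeta^{-n}\tau_n^{\star}\right). \tag{30}$$

The conditions of analytic continuation and validity of the expansion in a Δ-domain
are essential. Similarly, we have

$$f(z) \underset{z\to\zeta}{=} \sigma(z/\zeta) + o\!\left(\tau(z/\zeta)\right)
  \quad\longrightarrow\quad
  f_n \underset{n\to\infty}{=} \zeta^{-n}\sigma_n + o\!\left(\zeta^{-n}\tau_n^{\star}\right), \tag{31}$$

as a simple consequence of Theorem VI.3, part (ii) (little-oh transfer). The mappings
(30) and (31) supplemented by the accompanying analysis constitute the heart
of the singularity analysis process summarized in Figure VI.7.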


Many of the functions commonly encountered in analysis are found to be Δ-analytic.
This fact results from the property of the elementary functions (such as $\sqrt{\ }$,
log, tan) to be continuable to larger regions than what their expansions at 0 imply, as
well as to the rich set of composition properties that analytic functions satisfy.
Furthermore, asymptotic expansions at a singularity initially determined along the real
axis by elementary real analysis often hold in much wider regions of the complex
plane. The singularity analysis process is then likely to be applicable to a large number
of generating functions that are provided by the symbolic method, most notably
the iterative structures described in Section IV.4 (p. 249). In such cases, singularity
analysis greatly refines the exponential growth estimates obtained in Theorem IV.8
(p. 251). The condition is that singular expansions should be of a suitably moderate⁶
growth. We illustrate this situation now by treating combinatorial generating functions
obtained by the symbolic methods of Chapters I and II, for which explicit expressions
are available.


Example VI.2. Asymptotics of 2-regular graphs. This example completes the discussion of
Example VI.1, p. 379 relative to the EGF

$$R(z) = \frac{e^{-z/2 - z^2/4}}{\sqrt{1-z}}.$$

We follow step by step the singularity analysis process, as summarized in Figure VI.7.

1. Preparation. The function R(z) being the product of $e^{-z/2-z^2/4}$ (that is entire) and
of $(1-z)^{-1/2}$ (that is analytic in the unit disc) is itself analytic in the unit disc.
Also, since $(1-z)^{-1/2}$ is Δ-analytic (it is well-defined and analytic in the complex plane
slit along $\mathbb{R}_{\ge 1}$), R(z) is itself Δ-analytic, with a singularity at z = 1.

2. Singular expansion. The asymptotic expansion of R(z) near z = 1 is obtained starting
from the standard (analytic) expansion of $e^{-z/2-z^2/4}$ at z = 1,

$$e^{-z/2-z^2/4} = e^{-3/4} + e^{-3/4}(1-z) + \frac{e^{-3/4}}{4}(1-z)^2 - \frac{e^{-3/4}}{12}(1-z)^3 + \cdots.$$

The factor $(1-z)^{-1/2}$ is its own asymptotic expansion, clearly valid in any Δ-domain.
Performing the multiplication yields a complete expansion,

$$R(z) \sim \frac{e^{-3/4}}{\sqrt{1-z}} + e^{-3/4}\sqrt{1-z} + \frac{e^{-3/4}}{4}(1-z)^{3/2}
  - \frac{e^{-3/4}}{12}(1-z)^{5/2} + \cdots, \tag{32}$$

out of which terminating forms, with an O-error term, can be extracted.

3. Transfer. Take for instance the expansion of (32) limited to two terms plus an error
term. The singularity analysis process allows the transfer of (32) to coefficients, which we can
present in tabular form as follows:

R(z)                            $c_n \equiv [z^n]R(z)$

$e^{-3/4}(1-z)^{-1/2}$          $e^{-3/4}\binom{n-1/2}{-1/2} \sim \frac{e^{-3/4}}{\sqrt{\pi n}}\left[1 - \frac{1}{8n} + \frac{1}{128n^2} + \cdots\right]$
$+\,e^{-3/4}\sqrt{1-z}$         $+\,e^{-3/4}\binom{n-3/2}{-3/2} \sim \frac{-e^{-3/4}}{2\sqrt{\pi n^3}}\left[1 + \frac{3}{8n} + \cdots\right]$
$+\,O((1-z)^{3/2})$             $+\,O\!\left(\frac{1}{n^{5/2}}\right)$

Terms are then collected with expansions suitably truncated to the coarsest error term, so that
here a three-term expansion results. In the sequel, we shall no longer need to detail such
computations and we shall content ourselves with putting in parallel the function's expansion
and the coefficient's expansion, as in the following correspondence:

$$R(z) = \frac{e^{-3/4}}{\sqrt{1-z}} + e^{-3/4}\sqrt{1-z} + O\!\left((1-z)^{3/2}\right)
  \quad\longrightarrow\quad
  c_n = \frac{e^{-3/4}}{\sqrt{\pi n}} - \frac{5e^{-3/4}}{8\sqrt{\pi n^3}} + O\!\left(\frac{1}{n^{5/2}}\right).$$


⁶ For functions with fast growth at a singularity, the saddle-point method developed in
Chapter VIII becomes effectual.
Here is a numerical check. Set $c_n^{(1)} := e^{-3/4}/\sqrt{\pi n}$ and let $c_n^{(2)}$ represent
the sum of the first two terms of the expansion of $c_n$. One finds:

n                   5           50                         500

$n!\,c_n^{(1)}$     14.30212    1.1462888618 · 10^63       1.4542120372 · 10^1132
$n!\,c_n^{(2)}$     12.51435    1.1319602511 · 10^63       1.4523942721 · 10^1132
$n!\,c_n$           12          1.1319677968 · 10^63       1.4523943224 · 10^1132

Clearly, a complete asymptotic expansion in descending powers of n can be obtained in this
way. ∎
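
The numerical check can be redone exactly. The following sketch is an illustration under stated assumptions, not the authors' computation: it obtains the coefficients of exp(-z/2 - z^2/4) from the differential equation A' = -(1 + z)A/2, multiplies by the known series of (1 - z)^(-1/2) with rational arithmetic, and compares n! c_n with the one-term and two-term estimates. Function names are chosen here for convenience.

```python
from fractions import Fraction
from math import comb, exp, factorial, pi, sqrt


def two_regular_egf_coefficients(n_max):
    """Exact c_n = [z^n] exp(-z/2 - z^2/4) / sqrt(1 - z) for n = 0..n_max."""
    # A(z) = exp(-z/2 - z^2/4) satisfies A' = -(1 + z)/2 * A, hence
    # (k + 1) a_{k+1} = -(a_k + a_{k-1}) / 2.
    a = [Fraction(1)]
    for k in range(n_max):
        prev = a[k - 1] if k >= 1 else Fraction(0)
        a.append(-(a[k] + prev) / (2 * (k + 1)))
    # (1 - z)^(-1/2) has coefficients binom(2k, k) / 4^k.
    b = [Fraction(comb(2 * k, k), 4 ** k) for k in range(n_max + 1)]
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_max + 1)]


def approximations(n):
    """One-term and two-term estimates of c_n from the singular expansion."""
    one = exp(-0.75) / sqrt(pi * n)
    two = one * (1 - 5 / (8 * n))
    return one, two


if __name__ == "__main__":
    c = two_regular_egf_coefficients(50)
    for n in (5, 50):
        one, two = approximations(n)
        exact = factorial(n) * c[n]          # an integer: number of 2-regular graphs
        print(n, float(factorial(n) * Fraction(one)),
              float(factorial(n) * Fraction(two)), exact)
```

The exact values are integers, since R(z) is the EGF of a labelled class; the two-term estimate already agrees with them to about five significant digits at n = 50.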


Example VI.3. Asymptotics of unary–binary trees and Motzkin numbers. Unary–binary trees
are unlabelled plane trees that admit the specification and OGF:

$$\mathcal{U} = \mathcal{Z}(1 + \mathcal{U} + \mathcal{U}\times\mathcal{U})
  \quad\Longrightarrow\quad
  U(z) = \frac{1 - z - \sqrt{(1+z)(1-3z)}}{2z}.$$

(See Note I.39 (p. 68) and Subsection V.4 (p. 318) for the lattice path version.) The GF U(z)
is singular at z = −1 and z = 1/3, the dominant singularity being at z = 1/3. By branching
properties of the square-root function, U(z) is analytic in a Δ-domain like the one depicted
below:


Around the point 1/3, a singular expansion is obtained by multiplying $(1-3z)^{1/2}$ and the
analytic expansion of the factor $(1+z)^{1/2}/(2z)$. The singularity analysis process then
applies and yields automatically:

$$U(z) = 1 - 3^{1/2}\sqrt{1-3z} + O\!\left((1-3z)^{3/2}\right)
  \quad\longrightarrow\quad
  U_n = \sqrt{\frac{3}{4\pi n^3}}\;3^n + O\!\left(3^n n^{-5/2}\right).$$

Further terms in the singular expansion of U(z) at z = 1/3 provide additional terms in the
asymptotic expression of the Motzkin numbers $U_n$; for instance, the form

$$U_n = \sqrt{\frac{3}{4\pi n^3}}\;3^n\left(1 - \frac{15}{16n} + \frac{505}{512n^2}
  - \frac{8085}{8192n^3} + \frac{505659}{524288n^4} + O\!\left(\frac{1}{n^5}\right)\right)$$

results from an expansion of U(z) till $O((1-3z)^{11/2})$. The approximation provided by the
first three terms is quite good: for n = 10, it estimates $U_{10} = 835$ with an error less
than 1. ∎
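
The claim about n = 10 is easy to test. The sketch below (illustrative names, plain floating point for the estimate) computes the exact counts from the functional equation U = z(1 + U + U^2) and evaluates the three-term approximation.

```python
from math import pi, sqrt


def unary_binary_counts(n_max):
    """U_n = number of unary-binary trees with n nodes, from U = z (1 + U + U^2)."""
    u = [0] * (n_max + 1)
    if n_max >= 1:
        u[1] = 1
    for n in range(1, n_max):
        # [z^(n+1)] U = [z^n] (1 + U + U^2); the constant 1 contributes only at n = 0.
        u[n + 1] = u[n] + sum(u[k] * u[n - k] for k in range(1, n))
    return u


def three_term_estimate(n):
    """First three terms of the asymptotic expansion of U_n."""
    lead = sqrt(3 / (4 * pi * n ** 3)) * 3 ** n
    return lead * (1 - 15 / (16 * n) + 505 / (512 * n ** 2))


if __name__ == "__main__":
    u = unary_binary_counts(10)
    print(u[10], three_term_estimate(10))
```

The exact sequence starts 0, 1, 1, 2, 4, 9, 21, which are the Motzkin numbers shifted by one index.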


▷ VI.11. The population of Noah's Ark. The number of one-source directed lattice animals
(pyramids, Example I.18, p. 80) satisfies

$$P_n \equiv [z^n]\,\frac{1}{2}\left(\sqrt{\frac{1+z}{1-3z}} - 1\right)
  = \frac{3^n}{\sqrt{3\pi n}}\left(1 - \frac{1}{16n} + O\!\left(\frac{1}{n^2}\right)\right).$$
The expected size of the base of a random animal in $\mathcal{A}_n$ is $\sim \sqrt{\frac{4n}{27\pi}}$.
What is the asymptotic number of animals with a compact source of size k? ◁


Example VI.4. Asymptotics of children's rounds. Stanley [550] has introduced certain
combinatorial configurations that he has nicknamed "children's rounds": a round is a labelled
set of directed cycles, each of which has a centre attached. The specification and EGF are

$$\mathcal{R} = \mathrm{SET}(\mathcal{Z} \star \mathrm{CYC}(\mathcal{Z}))
  \quad\Longrightarrow\quad
  R(z) = \exp\left(z\log\frac{1}{1-z}\right) = (1-z)^{-z}.$$

The function R(z) is analytic in the $\mathbb{C}$-plane slit along $\mathbb{R}_{\ge 1}$, as is seen
by elementary properties of the composition of analytic functions. The singular expansion at
z = 1 is then mapped to an expansion for the coefficients:

$$R(z) = \frac{1}{1-z} + \log(1-z) + O\!\left((1-z)^{1/2}\right)
  \quad\longrightarrow\quad
  [z^n]R(z) = 1 - \frac{1}{n} + O\!\left(n^{-3/2}\right).$$

A more detailed analysis, based on the next term $\frac12(1-z)\log^2(1-z)$ of the singular
expansion, yields

$$[z^n]R(z) = 1 - \frac{1}{n} - \frac{1}{n^2}\left(\log n + \gamma - 1\right)
  + O\!\left(\frac{\log^2 n}{n^3}\right),$$

and an expansion to any order can be easily obtained. ∎
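
A quick numerical experiment supports the refined estimate. The sketch below is an illustration with hypothetical names: it computes the coefficients of exp(z log 1/(1 - z)) in floating point from the standard recurrence for the exponential of a power series, then compares the value at n = 200 with 1 - 1/n - (log n + γ - 1)/n^2. The numerical value of Euler's constant is hard-coded.

```python
from math import log

EULER_GAMMA = 0.5772156649015329


def rounds_coefficients(n_max):
    """Floating-point r_n = [z^n] exp(z log(1/(1 - z))) for n = 0..n_max."""
    # With P(z) = z log(1/(1-z)) = sum_{k>=2} z^k / (k - 1), R = exp(P) satisfies
    # R' = P' R, that is, n r_n = sum_{k=2}^{n} k p_k r_{n-k} with p_k = 1/(k-1).
    r = [1.0] + [0.0] * n_max
    for n in range(1, n_max + 1):
        r[n] = sum(k / (k - 1) * r[n - k] for k in range(2, n + 1)) / n
    return r


def estimate(n):
    """Three-term asymptotic estimate of [z^n] R(z)."""
    return 1 - 1 / n - (log(n) + EULER_GAMMA - 1) / n ** 2


if __name__ == "__main__":
    r = rounds_coefficients(200)
    for n in (10, 50, 200):
        print(n, r[n], estimate(n), r[n] - estimate(n))
```

All terms of the recurrence are non-negative, so the floating-point evaluation is numerically benign; the residual shrinks at the rate suggested by the error term.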
▷ VI.12. The asymptotic shape of the rounds numbers. A complete asymptotic expansion has
the form

$$[z^n]R(z) \sim 1 - \sum_{j\ge 1}\frac{P_j(\log n)}{n^j},$$

where $P_j$ is a polynomial of degree j − 1. (The coefficients of $P_j$ are rational
combinations of powers of $\gamma, \zeta(2), \ldots, \zeta(j-1)$.) The successive terms in this
expansion are easily obtained by a computer algebra program. ◁


Example VI.5. Asymptotics of coefficients of an elementary function. Our final example
is meant to show the way rather arbitrary compositions of basic functions can be treated by
singularity analysis, much in the spirit of Section IV.4, p. 249. Let
$\mathcal{C} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{C})$ be the class of general labelled plane
trees. Consider the labelled class defined by substitution

$$\mathcal{F} = \mathcal{C} \circ \mathrm{CYC}(\mathrm{CYC}(\mathcal{Z}))
  \quad\Longrightarrow\quad F(z) = C(L(L(z))).$$

There, $C(z) = \frac{1}{2}\left(1 - \sqrt{1-4z}\right)$ and $L(z) = \log\frac{1}{1-z}$.
Combinatorially, $\mathcal{F}$ is the class of trees in which nodes are replaced by cycles of
cycles, a rather artificial combinatorial object, and

$$F(z) = \frac{1}{2}\left[1 - \sqrt{1 - 4\log\frac{1}{1 - \log\frac{1}{1-z}}}\,\right].$$

The problem is first to locate the dominant singularity of F(z), then to determine its nature,
which can be done inductively on the structure of F(z). The dominant positive singularity
$\rho$ of F(z) satisfies $L(L(\rho)) = 1/4$ and one has

$$\rho = 1 - e^{\,e^{-1/4} - 1} \doteq 0.198443,$$

given that C(z) is singular at 1/4 and L(z) has positive coefficients. Since L(L(z)) is analytic
at $\rho$, a local expansion of F(z) is obtained next by composition of the singular expansion
of C(z) at 1/4 with the standard Taylor expansion of L(L(z)) at $\rho$. We find

$$F(z) = \frac{1}{2} - C_1(\rho - z)^{1/2} + O\!\left((\rho - z)^{3/2}\right)
  \quad\longrightarrow\quad
  [z^n]F(z) = \frac{C_1\,\rho^{-n+1/2}}{2\sqrt{\pi n^3}}\left[1 + O\!\left(\frac{1}{n}\right)\right],$$

with $C_1 = e^{\frac{5}{8} - \frac{1}{2}e^{-1/4}} \doteq 1.26566$. ∎
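
The two constants of this example follow from elementary calculus and can be recomputed directly. In the sketch below (names are illustrative), ρ is obtained by solving L(L(ρ)) = 1/4 in closed form, and C_1 is the square root of the derivative of L(L(z)) at ρ, since 1 - 4L(L(z)) ≈ 4 (L∘L)'(ρ)(ρ - z) near the singularity.

```python
from math import exp, log, sqrt


def L(z):
    """L(z) = log(1 / (1 - z)), for real z < 1."""
    return log(1.0 / (1.0 - z))


def dominant_singularity():
    """Solve L(L(rho)) = 1/4: L(rho) = 1 - e^(-1/4), hence rho = 1 - exp(e^(-1/4) - 1)."""
    return 1.0 - exp(exp(-0.25) - 1.0)


def constant_c1(rho):
    """C_1 = sqrt((L o L)'(rho)), with (L o L)'(z) = 1 / ((1 - L(z)) (1 - z))."""
    return sqrt(1.0 / ((1.0 - L(rho)) * (1.0 - rho)))


if __name__ == "__main__":
    rho = dominant_singularity()
    print(rho, L(L(rho)), constant_c1(rho), exp(5 / 8 - 0.5 * exp(-0.25)))
```

The value of C_1 computed from the derivative coincides with the closed form given in the text.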


▷ VI.13. The asymptotic number of trains. Combinatorial trains were introduced in
Section IV.4 (p. 249) as a way to exemplify the power of complex asymptotic methods. One
finds that, at its dominant singularity $\rho$, the EGF Tr(z) is of the form
$Tr(z) \sim C/(1 - z/\rho)$, and, by singularity analysis,

$$[z^n]\,Tr(z) \sim 0.11768\,31406\,15497 \cdot 2.06131\,73279\,40138^{\,n}.$$

(This asymptotic approximation is good to 15 significant digits for n = 50, in accordance with
the fact that the dominant singularity is a simple pole.) ◁


VI.5. Multiple singularities 


The previous section has described in detail the analysis of functions with a single 
dominant singularity. The extension to functions that have finitely many (by necessity 
isolated) singularities on their circle of convergence follows along entirely similar 
lines. It parallels the situation of rational and meromorphic functions in Chapter IV 
(p. 263) and is technically simple, the net result being: 


    In the case of multiple singularities, the separate contributions from each of
    the singularities, as given by the basic singularity analysis process, are to
    be added up.
As in (29), p. 393, we let S be the standard scale of functions singular at 1, namely

$$S = \left\{(1-z)^{-\alpha}\lambda(z)^{\beta} \;\middle|\; \alpha, \beta \in \mathbb{C}\right\},
  \qquad \lambda(z) := \frac{1}{z}\log\frac{1}{1-z}.$$


Theorem VI.5 (Singularity analysis, multiple singularities). Let f(z) be analytic in
$|z| < \rho$ and have a finite number of singularities on the circle $|z| = \rho$ at points
$\zeta_j = \rho e^{i\theta_j}$, for j = 1 .. r. Assume that there exists a Δ-domain $\Delta_0$
such that f(z) is analytic in the indented disc

$$D = \bigcap_{j=1}^{r}\left(\zeta_j \cdot \Delta_0\right),$$

with $\zeta \cdot \Delta_0$ the image of $\Delta_0$ by the mapping $z \mapsto \zeta z$.
Assume that there exist r functions $\sigma_1, \ldots, \sigma_r$, each a linear combination of
elements from the scale S, and a function $\tau \in S$ such that

$$f(z) = \sigma_j(z/\zeta_j) + O\!\left(\tau(z/\zeta_j)\right) \qquad\text{as } z \to \zeta_j \text{ in } D.$$

Then the coefficients of f(z) satisfy the asymptotic estimate

$$f_n = \sum_{j=1}^{r}\zeta_j^{-n}\sigma_{j,n} + O\!\left(\rho^{-n}\tau_n^{\star}\right),$$

where each $\sigma_{j,n} = [z^n]\sigma_j(z)$ has its coefficients determined by Theorems VI.1,
VI.2 and $\tau_n^{\star} = n^{a-1}(\log n)^{b}$, if $\tau(z) = (1-z)^{-a}\lambda(z)^{b}$.

A function analytic in a domain like D is sometimes said to be star-continuable, a
notion that naturally generalizes Δ-analyticity for functions with several dominant
singularities. Furthermore, a similar statement holds with o-error terms replacing Os.
Figure VI.8. Multiple singularities (r = 3): analyticity domain (D, left) and composite
integration contour ($\gamma$, right).


Proof. Just as in the case of a single singularity, the proof bases itself on Cauchy's
coefficient formula

$$f_n = [z^n]f(z) = \frac{1}{2i\pi}\int_{\gamma} f(z)\,\frac{dz}{z^{n+1}},$$

where a composite contour $\gamma$ depicted on Figure VI.8 is used. Estimates on each part
of the contour obey exactly the same principles as in the proofs of Theorems VI.1–VI.3.
Let $\gamma^{(j)}$ be the open loop around $\zeta_j$ that comes from the outer circle, winds
about $\zeta_j$ and joins again the outer circle; let r be the radius of the outer circle.

(i) The contribution along the arcs of the outer circle is $O(r^{-n})$, that is,
    exponentially small.
(ii) The contribution along the loop $\gamma^{(1)}$ (say) separates into

$$\frac{1}{2i\pi}\int_{\gamma^{(1)}} f(z)\,\frac{dz}{z^{n+1}} = I' + I'',$$

$$I' := \frac{1}{2i\pi}\int_{\gamma^{(1)}}\sigma_1(z/\zeta_1)\,\frac{dz}{z^{n+1}}, \qquad
  I'' := \frac{1}{2i\pi}\int_{\gamma^{(1)}}\bigl(f(z) - \sigma_1(z/\zeta_1)\bigr)\frac{dz}{z^{n+1}}.$$

    The quantity $I'$ is estimated by extending the open loop to infinity by the
    same method as in the proof of Theorems VI.1 and VI.2: it is found to equal
    $\zeta_1^{-n}\sigma_{1,n}$ plus an exponentially small term. The quantity $I''$,
    corresponding to the error term, is estimated by the same bounding technique as in the
    proof of Theorem VI.3 and is found to be $O(\rho^{-n}\tau_n^{\star})$.

Collecting the various contributions completes the proof of the statement. ∎


Theorem VI.5 expresses that, in the case of multiple singularities, each domi- 
nant singularity can be analysed separately; the singular expansions are then each 
transferred to coefficients, and the corresponding asymptotic contributions are finally 
collected. Two examples illustrating the process follow. 


Example VI.6. An artificial example. Let us demonstrate the modus operandi on the simple function

$$g(z) = \frac{e^{z}}{\sqrt{1-z^{2}}}. \tag{33}$$

There are two singularities at $z = +1$ and $z = -1$, with

$$g(z) \sim \frac{e}{\sqrt{2}\sqrt{1-z}} \quad (z \to +1) \qquad\text{and}\qquad g(z) \sim \frac{e^{-1}}{\sqrt{2}\sqrt{1+z}} \quad (z \to -1).$$

The function is clearly star-continuable with the singular expansions being valid in an indented disc. We have

$$[z^n]\frac{e}{\sqrt{2}\sqrt{1-z}} \sim \frac{e}{\sqrt{2\pi n}} \qquad\text{and}\qquad [z^n]\frac{e^{-1}}{\sqrt{2}\sqrt{1+z}} \sim \frac{e^{-1}(-1)^n}{\sqrt{2\pi n}}.$$

To obtain the coefficient $[z^n]g(z)$, it suffices to add up these two contributions (by Theorem VI.5), so that

$$[z^n]g(z) \sim \frac{1}{\sqrt{2\pi n}}\left[e + (-1)^n e^{-1}\right].$$

If expansions at $+1$ (respectively $-1$) are written with an error term, which is of the form $O((z-1)^{1/2})$ (respectively, $O((z+1)^{1/2})$), there results an estimate of the coefficients $g_n = [z^n]g(z)$, which can be put under the form

$$g_{2n} = \frac{\cosh(1)}{\sqrt{\pi n}} + O\left(n^{-3/2}\right), \qquad g_{2n+1} = \frac{\sinh(1)}{\sqrt{\pi n}} + O\left(n^{-3/2}\right).$$

This makes explicit the dependency of the asymptotic form of $g_n$ on the parity of the index $n$. Clearly a full asymptotic expansion can be obtained.
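As a numerical illustration of the parity effect just described, the following minimal sketch computes $g_N = [z^N]\,e^z/\sqrt{1-z^2}$ by convolving the series of $e^z$ with that of $(1-z^2)^{-1/2}$ and compares it with the $\cosh(1)$ and $\sinh(1)$ forms. The helper name `g_coeff` and the choice $n = 200$ are illustrative assumptions, not part of the text.

```python
import math

def g_coeff(N):
    """Return [z^N] of e^z / sqrt(1 - z^2), computed by convolving two series."""
    # inv_fact[k] = 1/k!  (coefficients of e^z); tiny values underflow harmlessly to 0.0
    inv_fact = [1.0]
    for k in range(1, N + 1):
        inv_fact.append(inv_fact[-1] / k)
    # c[m] = [z^(2m)] (1 - z^2)^(-1/2) = binom(2m, m) / 4^m, via the ratio (2m-1)/(2m)
    c = [1.0]
    for m in range(1, N // 2 + 1):
        c.append(c[-1] * (2 * m - 1) / (2 * m))
    return sum(c[m] * inv_fact[N - 2 * m] for m in range(N // 2 + 1))

n = 200
# Ratios of exact coefficients to the two parity-dependent estimates
even_ratio = g_coeff(2 * n) * math.sqrt(math.pi * n) / math.cosh(1)
odd_ratio = g_coeff(2 * n + 1) * math.sqrt(math.pi * n) / math.sinh(1)
print(even_ratio, odd_ratio)
```

Both ratios should be close to 1, with a discrepancy of relative order $1/n$.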


Example VI.7. Permutations with cycles of odd length. Consider the specification and EGF

$$\mathcal{F} = \mathrm{SET}(\mathrm{CYC}_{\mathrm{odd}}(\mathcal{Z})) \implies f(z) = \exp\left(\frac{1}{2}\log\frac{1+z}{1-z}\right) = \sqrt{\frac{1+z}{1-z}}.$$

The singularities of $f$ are at $z = +1$ and $z = -1$, the function being obviously star-continuable. By singularity analysis (Theorem VI.5), we have automatically:

$$f(z) = \begin{cases} \dfrac{2^{1/2}}{\sqrt{1-z}} + O\left((1-z)^{1/2}\right) & (z \to 1) \\[2mm] O\left((1+z)^{1/2}\right) & (z \to -1) \end{cases} \implies [z^n]f(z) = \frac{2^{1/2}}{\sqrt{\pi n}} + O\left(n^{-3/2}\right).$$

For the next asymptotic order, the singular expansions

$$f(z) = \begin{cases} \dfrac{2^{1/2}}{\sqrt{1-z}} - 2^{-3/2}\sqrt{1-z} + O\left((1-z)^{3/2}\right) & (z \to 1) \\[2mm] 2^{-1/2}\sqrt{1+z} + O\left((1+z)^{3/2}\right) & (z \to -1) \end{cases}$$

yield

$$[z^n]f(z) = \frac{2^{1/2}}{\sqrt{\pi n}} - \frac{(-1)^n 2^{-3/2}}{\sqrt{\pi n^3}} + O\left(n^{-5/2}\right).$$

This example illustrates the occurrence of singularities that have different weights, in the sense of being associated with different exponents.
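The two-term estimate can be checked numerically. The sketch below relies on the elementary identity $\sqrt{(1+z)/(1-z)} = (1+z)(1-z^2)^{-1/2}$, which gives $[z^n]f(z) = \binom{2m}{m}4^{-m}$ with $m = \lfloor n/2 \rfloor$; it also counts permutations with odd cycles by brute force for small sizes. The function names are hypothetical and chosen for illustration only.

```python
import math
from itertools import permutations

def f_coeff(n):
    """[z^n] sqrt((1+z)/(1-z)) = binom(2m, m) / 4^m with m = floor(n/2)."""
    m = n // 2
    return math.comb(2 * m, m) / 4 ** m

def count_odd_cycle_perms(n):
    """Brute-force count of permutations of size n whose cycles all have odd length."""
    count = 0
    for perm in permutations(range(n)):
        seen, ok = set(), True
        for start in range(n):
            if start in seen:
                continue
            length, j = 0, start
            while j not in seen:       # follow the cycle containing `start`
                seen.add(j)
                j = perm[j]
                length += 1
            if length % 2 == 0:
                ok = False
                break
        count += ok
    return count

def two_term_estimate(n):
    """Main term plus the alternating correction of order n^(-3/2)."""
    return math.sqrt(2 / (math.pi * n)) - (-1) ** n * 2 ** -1.5 / math.sqrt(math.pi * n ** 3)

rel_errors = [abs(f_coeff(n) / two_term_estimate(n) - 1) for n in (100, 101)]
print(rel_errors)
```

The relative errors for $n = 100$ and $n = 101$ are both of order $n^{-2}$, in agreement with the $O(n^{-5/2})$ remainder.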


The discussion of multiple dominant singularities ties in well with the earlier discussion of Subsection IV.6.1, p. 263. In the periodic case where the dominant singularities are at roots of unity, different regimes manifest themselves cyclically depending on congruence properties of the index $n$, as in the two examples above. When the dominant singularities have arguments that are not commensurate to $\pi$ (a comparatively rare situation), irregular fluctuations appear, in which case the situation is similar to what was already discussed, regarding rational and meromorphic functions, in Subsection IV.6.1.


VI.6. Intermezzo: functions amenable to singularity analysis 


Let us say that a function is amenable to singularity analysis, or SA for short, if it satisfies the conditions of singularity analysis, as expressed by Theorem VI.4 (single dominant singularity) or Theorem VI.5 (multiple dominant singularities). The property of being SA is preserved by several basic operations of analysis: we have already seen this feature in passing, when determining singular expansions of functions obtained by sums, products, or compositions in Examples VI.2–VI.5.

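As a starting example, it is easily recognized that the assumptions of $\Delta$-analyticity for two functions $f(z)$, $g(z)$ accompanied by the singular expansions

$$f(z) \underset{z\to 1}{\sim} c\,(1-z)^{-\alpha}, \qquad g(z) \underset{z\to 1}{\sim} d\,(1-z)^{-\beta},$$

and the condition $\alpha, \beta \notin \mathbb{Z}_{\le 0}$ imply for the coefficients of the sum

$$[z^n]\bigl(f(z)+g(z)\bigr) \sim \begin{cases} c\,\dfrac{n^{\alpha-1}}{\Gamma(\alpha)} & \alpha > \beta \\[2mm] (c+d)\,\dfrac{n^{\alpha-1}}{\Gamma(\alpha)} & \alpha = \beta,\ c+d \neq 0 \\[2mm] d\,\dfrac{n^{\beta-1}}{\Gamma(\beta)} & \alpha < \beta. \end{cases}$$

Similarly, for products, we have

$$[z^n]\bigl(f(z)g(z)\bigr) \sim cd\,\frac{n^{\alpha+\beta-1}}{\Gamma(\alpha+\beta)},$$

provided $\alpha+\beta \notin \mathbb{Z}_{\le 0}$.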
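The product rule can be tested on a concrete pair. The sketch below is an illustration under stated assumptions: the functions $f(z) = (1-z)^{-1/2}$ (so $c = 1$, $\alpha = 1/2$) and $g(z) = (1+z)(1-z)^{-1/3}$ (so $d = 2$, $\beta = 1/3$) are chosen here for convenience and do not come from the text; the coefficient of the product is obtained by direct convolution.

```python
import math

def power_coeffs(a, N):
    """Coefficients [z^n](1 - z)^(-a) for n = 0..N, via the ratio (n + a - 1)/n."""
    out = [1.0]
    for n in range(1, N + 1):
        out.append(out[-1] * (n + a - 1) / n)
    return out

N = 2000
alpha, beta = 0.5, 1 / 3
c, d = 1.0, 2.0                                   # d = (1 + z) evaluated at z = 1

f = power_coeffs(alpha, N)                        # f(z) = (1 - z)^(-1/2)
b = power_coeffs(beta, N)
g = [b[0]] + [b[n] + b[n - 1] for n in range(1, N + 1)]   # g(z) = (1 + z)(1 - z)^(-1/3)

# Exact coefficient of the product by convolution, against the transfer prediction
product_coeff = sum(f[k] * g[N - k] for k in range(N + 1))
predicted = c * d * N ** (alpha + beta - 1) / math.gamma(alpha + beta)
ratio = product_coeff / predicted
print(ratio)
```

The ratio tends to 1 with a relative error of order $1/N$, as expected in the generic case where no exponent is a non-positive integer.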

The simple considerations above illustrate the robustness of singularity analysis. 
They also indicate that properties are easy to state in the generic case where no nega- 
tive integral exponents are present. However, if all cases are to be covered, there can 
easily be an explosion of the number of particular situations, which may render some- 
what clumsy the enunciation of complete statements. Accordingly, in what follows, 
we shall largely confine ourselves to generic cases, as long as these suffice to develop 
the important mathematical technique at stake for each particular problem. 


In the remainder of this chapter, we proceed to enlarge the class of functions 
recognized to be of SA, keeping in mind the needs of analytic combinatorics. The 
following types of functions are treated in later sections. 


(i) Inverse functions (Section VI.7). The inverse of an analytic function is, under mild conditions, of SA type. In the case of functions attached to simple varieties of trees (corresponding to the inversion of $y/\phi(y)$), the singular expansion invariably has an exponent of $\frac{1}{2}$ attached to it (a square-root singularity). This applies in particular to the Cayley tree function, in terms of which many combinatorial structures and parameters can be analysed.

(ii) Polylogarithms (Section VI.8). These functions are the generating functions of simple arithmetic sequences such as $(n^{\theta})$ for an arbitrary $\theta \in \mathbb{C}$. The fact that polylogarithms are SA opens the possibility of estimating a large number of sums, which involve both combinatorial terms (e.g., binomial coefficients) and elements like $\sqrt{n}$ and $\log n$. Such sums appear recurrently in the analysis of cost functionals of combinatorial structures and algorithms.

(iii) Composition (Section VI.9). The composition of SA functions often proves to be itself SA. This fact has implications for the analysis of composition schemas and makes possible a broad extension of the supercritical sequence schema treated in Section V.2 (p. 293).

(iv) Differentiation, integration, and Hadamard products (Section VI.10). These are three operations on analytic functions that preserve the property for a function to be SA. Applications are given to tree recurrences and to multidimensional walk problems.


A main theme of this book is that elementary combinatorial classes tend to have 
generating functions whose singularity structure is strongly constrained—in most cases, 
singularities are isolated. The singularity analysis process is then a prime technique 
for extracting asymptotic information from such generating functions. 


VI.7. Inverse functions 


Recursively defined structures lead to functional equations whose solutions may often be analysed locally near singularities. An important case is the one of functions defined by inversion. It includes the Cayley tree function as well as all generating functions associated to simple varieties of trees (Subsections I.5.1 (p. 65), II.5.1 (p. 126), and III.6.2 (p. 193)). A common pattern in this context is the appearance of singularities of the square-root type, which proves to be universal among a broad class of problems involving trees and tree-like structures. Accordingly, by singularity analysis, the square-root singularity induces subexponential factors of the asymptotic form $n^{-3/2}$ in expansions of coefficients; we shall further develop this theme in Chapter VII, pp. 452–493.


Inverse functions. Singularities of functions defined by inversion have been located in Subsection IV.7.1 (p. 275) and our treatment will proceed from there. The goal is to estimate the coefficients of a function defined implicitly by an equation of the form

$$y(z) = z\,\phi(y(z)) \qquad\text{or equivalently}\qquad z = \frac{y(z)}{\phi(y(z))}. \tag{34}$$

The problem of solving (34) is one of functional inversion: we have seen (Lemmas IV.2 and IV.3, pp. 275–277) that an analytic function admits locally an analytic inverse if and only if its first derivative is non-zero. We operate here under the following assumptions:


Condition ($H_1$). The function $\phi(u)$ is analytic at $u = 0$ and satisfies

$$\phi(0) \neq 0, \qquad [u^n]\phi(u) \ge 0, \qquad \phi(u) \not\equiv \phi_0 + \phi_1 u. \tag{35}$$

(As a consequence, the inversion problem is well defined around 0. The nonlinearity of $\phi$ only excludes the case $\phi(u) = \phi_0 + \phi_1 u$, corresponding to $y(z) = \phi_0 z/(1 - \phi_1 z)$.)

Condition ($H_2$). Within the open disc of convergence of $\phi$ at 0, $|z| < R$, there exists a (then necessarily unique) positive solution to the characteristic equation:

$$\exists\,\tau,\ 0 < \tau < R, \qquad \phi(\tau) - \tau\phi'(\tau) = 0. \tag{36}$$

(Existence is granted as soon as $\lim x\phi'(x)/\phi(x) > 1$ as $x \to R^{-}$, with $R$ the radius of convergence of $\phi$ at 0; see Proposition IV.5, p. 278.)


Then (by Proposition IV.5, p. 278), the radius of convergence of $y(z)$ is the corresponding positive value $\rho$ of $z$ such that $y(\rho) = \tau$, that is to say,

$$\rho = \frac{\tau}{\phi(\tau)} = \frac{1}{\phi'(\tau)}. \tag{37}$$

We start with a calculation indicating in a plain context the occurrence of a square-root singularity.


Example VI.8. A simple analysis of the Cayley tree function. The situation corresponding to the function $\phi(u) = e^{u}$, so that $y(z) = z e^{y(z)}$ (defining the Cayley tree function $T(z)$), is typical of general analytic inversion. From (36), the radius of convergence of $y(z)$ is $\rho = e^{-1}$ corresponding to $\tau = 1$. The image of a circle in the $y$-plane, centred at the origin and having radius $r < 1$, by the function $y e^{-y}$ is a curve of the $z$-plane that properly contains the circle $|z| = r e^{-r}$ (see Figure VI.9) as $\phi(y) = e^{y}$, which has non-negative coefficients, satisfies

$$\left|\phi(r e^{i\theta})\right| \le \phi(r) \qquad\text{for all } \theta \in [-\pi, +\pi],$$

the inequality being strict for all $\theta \neq 0$. The following observation is the key to analytic continuation: Since the first derivative of $y/\phi(y)$ vanishes at 1, the mapping $y \mapsto y/\phi(y)$ is angle-doubling, so that the image of the circle of radius 1 is a curve $\mathcal{C}$ that has a cusp at $\rho = e^{-1}$. (See Figure VI.9; Notes VI.18 and VI.19 provide interesting generalizations.)

This geometry indicates that the solution of $z = y e^{-y}$ is uniquely defined for $z$ inside $\mathcal{C}$, so that $y(z)$ is $\Delta$-analytic (see the proof of Theorem VI.6 below). A singular expansion for $y(z)$ is then derived from reversion of the power series expansion of $z = y e^{-y}$. We have

$$y e^{-y} = e^{-1} - \frac{1}{2e}(y-1)^2 + \frac{1}{3e}(y-1)^3 - \frac{1}{8e}(y-1)^4 + \cdots. \tag{38}$$

Observe both the absence of a linear term and the presence of a quadratic term, $-\frac{1}{2e}(y-1)^2$ (boxed in the original display). Then, solving $z = y e^{-y}$ for $y$ gives

$$y - 1 = -\sqrt{2}\,(1 - ez)^{1/2} + \frac{2}{3}(1 - ez) + O\left((1 - ez)^{3/2}\right),$$

where the square root arises precisely from inversion of the quadratic term. (A full expansion can furthermore be obtained.)
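The square-root expansion can be observed numerically. The sketch below is a minimal check, assuming only that $y e^{-y}$ is increasing on $[0,1]$: it solves $y e^{-y} = z$ by bisection just below $z = e^{-1}$ and compares the result with the two-term expansion above. The helper name `cayley_tree` and the value $1 - ez = 10^{-4}$ are illustrative choices.

```python
import math

def cayley_tree(z, iterations=200):
    """Solve y*exp(-y) = z for y in [0, 1] by bisection (valid for 0 <= z <= 1/e)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mid * math.exp(-mid) < z:   # y*exp(-y) is increasing on [0, 1]
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

eps = 1e-4                              # eps stands for 1 - e*z
z = (1 - eps) / math.e
exact = cayley_tree(z)
approx = 1 - math.sqrt(2 * eps) + (2 / 3) * eps
gap = exact - approx                    # expected to be of order eps^(3/2)
print(exact, approx, gap)
```

With $1 - ez = 10^{-4}$ the gap is of order $10^{-6}$, consistent with the $O((1-ez)^{3/2})$ remainder.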
Figure VI.9. The images of concentric circles by the mapping $y \mapsto z = y e^{-y}$. It is seen that $y \mapsto z = y e^{-y}$ is injective on $|y| \le 1$ with an image extending beyond the circle $|z| = e^{-1}$ [in grey], so that the inverse function $y(z)$ is analytically continuable in a $\Delta$-domain around $z = e^{-1}$. Since the direct mapping $y e^{-y}$ is quadratic at 1 (with value $e^{-1}$, see (38)), the inverse function has a square-root singularity at $e^{-1}$ (with value 1).


Analysis of inverse functions. The calculation of Example VI.8 now needs to be extended to the general case, $y = z\phi(y)$. This involves three steps: (i) all the dominant singularities are to be located; (ii) analyticity of $y(z)$ in a $\Delta$-domain must be established; (iii) the singular expansion, obtained formally so far and involving a square-root singularity, needs to be determined. Step (i) requires a special discussion and is related to periodicities.

A basic example like $\phi(u) = 1 + u^2$ (binary trees), for which

$$y(z) = \frac{1 - \sqrt{1 - 4z^2}}{2z},$$

shows that $y(z)$ may have several dominant singularities (here, two conjugate singularities at $-\frac{1}{2}$ and $+\frac{1}{2}$). The conditions for this to happen are related to our discussion of periodicities in Definition IV.5, p. 266. As a consequence of this definition, $\phi(u)$, which satisfies $\phi(0) \neq 0$, is $p$-periodic if $\phi(u) = g(u^p)$ for some power series $g$ (see p. 266) and $p \ge 2$; it is aperiodic otherwise. An elementary argument developed in Note VI.17, p. 407, shows that the aperiodicity assumption entails no loss of analytic generality (periodicity does not occur for $y(z)$ unless $\phi(u)$ is itself periodic, a case which, in addition, turns out to be reducible to the aperiodic situation).


Theorem VI.6 (Singular Inversion). Let $\phi$ be a nonlinear function satisfying the conditions ($H_1$) and ($H_2$) of Equations (35) and (36), and let $y(z)$ be the solution of $y = z\phi(y)$ satisfying $y(0) = 0$. Then, the quantity $\rho = \tau/\phi(\tau)$ is the radius of convergence of $y(z)$ at 0 (with $\tau$ the root of the characteristic equation), and the singular expansion of $y(z)$ near $\rho$ is of the form

$$y(z) = \tau - d_1\sqrt{1 - z/\rho} + \sum_{j \ge 2} (-1)^j d_j\,(1 - z/\rho)^{j/2}, \qquad d_1 := \sqrt{\frac{2\phi(\tau)}{\phi''(\tau)}},$$

with the $d_j$ being some computable constants.
Assume that, in addition, $\phi$ is aperiodic⁷. Then, one has

$$[z^n]y(z) \sim \sqrt{\frac{\phi(\tau)}{2\phi''(\tau)}}\;\frac{\rho^{-n}}{\sqrt{\pi n^3}}\left[1 + \sum_{k=1}^{\infty}\frac{e_k}{n^k}\right],$$

for a family $e_k$ of computable constants.


Proof. Proposition IV.5, p. 278, shows that $\rho$ is indeed the radius of convergence of $y(z)$. The Singular Inversion Lemma (Lemma IV.3, p. 277) also shows that $y(z)$ can be continued to a neighbourhood of $\rho$ slit along the ray $\mathbb{R}_{\ge \rho}$.

The singular expansion at $\rho$ is determined as in Example VI.8. Indeed, the relation between $z$ and $y$, in the vicinity of $(z, y) = (\rho, \tau)$, may be put under the form

$$\rho - z = H(y), \qquad\text{where}\quad H(y) := \left(\frac{\tau}{\phi(\tau)} - \frac{y}{\phi(y)}\right), \tag{39}$$

the function $H(y)$ in the right-hand side being such that $H(\tau) = H'(\tau) = 0$. Thus, the dependency between $y$ and $z$ is locally a quadratic one:

$$\rho - z = \frac{1}{2!}H''(\tau)(y - \tau)^2 + \frac{1}{3!}H'''(\tau)(y - \tau)^3 + \cdots.$$

When this relation is locally inverted, a square-root appears:

$$-\sqrt{\rho - z} = \sqrt{\frac{H''(\tau)}{2}}\,(y - \tau)\left[1 + c_1(y - \tau) + c_2(y - \tau)^2 + \cdots\right].$$

The determination with a $-\sqrt{\ }$ should be chosen there as $y(z)$ increases to $\tau^{-}$ as $z \to \rho^{-}$. This implies, by solving with respect to $y - \tau$, the relation

$$y - \tau = -d_1^{*}(\rho - z)^{1/2} + d_2^{*}(\rho - z) - d_3^{*}(\rho - z)^{3/2} + \cdots,$$

where $d_1^{*} = \sqrt{2/H''(\tau)}$ with $H''(\tau) = \tau\phi''(\tau)/\phi(\tau)^2$. The singular expansion at $\rho$ results.

It now remains to exclude the possibility for $y(z)$ to have singularities other than $\rho$ on the circle $|z| = \rho$, in the aperiodic case. Observe that $y(\rho)$ is well defined (in fact $y(\rho) = \tau$), so that the series representing $y(z)$ converges at $\rho$ as well as on the whole circle (given positivity of the coefficients). If $\phi(z)$ is aperiodic, then so is $y(z)$. Consider any point $\zeta$ such that $|\zeta| = \rho$ and $\zeta \neq \rho$ and set $\eta = y(\zeta)$. We then have $|\eta| < \tau$ (by the Daffodil Lemma: Lemma IV.1, p. 266). The function $y(z)$ is analytic


⁷If $\phi$ has maximal period $p$, then one must restrict $n$ to $n \equiv 1 \bmod p$; in that case, there is an extra factor of $p$ in the estimate of $y_n$: see Note VI.17 and Equation (40).
Type | $\phi(u)$ | singular expansion of $y(z)$ | coefficient $[z^n]y(z)$
binary | $(1+u)^2$ | $1 - 4\sqrt{\frac14 - z} + \cdots$ | $\dfrac{4^n}{\sqrt{\pi n^3}} + O(n^{-5/2})$
unary-binary | $1 + u + u^2$ | $1 - 3\sqrt{\frac13 - z} + \cdots$ | $\dfrac{3^{n+1/2}}{2\sqrt{\pi n^3}} + O(n^{-5/2})$
general | $(1-u)^{-1}$ | $\frac12 - \sqrt{\frac14 - z}$ | $\dfrac{4^{n-1}}{\sqrt{\pi n^3}} + O(n^{-5/2})$
Cayley | $e^{u}$ | $1 - \sqrt{2e}\sqrt{e^{-1} - z} + \cdots$ | $\dfrac{e^n}{\sqrt{2\pi n^3}} + O(n^{-5/2})$

Figure VI.10. Singularity analysis of some simple varieties of trees.


at $\zeta$ by virtue of the Analytic Inversion Lemma (Lemma IV.2, p. 275) and the property that

$$\left.\frac{d}{dy}\,\frac{y}{\phi(y)}\right|_{y=\eta} \neq 0.$$

(This last property is derived from the fact that the numerator of the quantity on the left,

$$\phi(\eta) - \eta\phi'(\eta) = \phi_0 - \phi_2\eta^2 - 2\phi_3\eta^3 - 3\phi_4\eta^4 - \cdots,$$

cannot vanish, by the triangle inequality since $|\eta| < \tau$.) Thus, under the aperiodicity assumption, $y(z)$ is analytic on the circle $|z| = \rho$ punctured at $\rho$. The expansion of the coefficients then results from basic singularity analysis.


Figure VI.10 provides a table of the most basic varieties of simple trees and the 
corresponding asymptotic estimates. With Theorem VI.6, we now have available a 
powerful method that permits us to analyse not only implicitly defined functions but 
also expressions built upon them. This fact will be put to good use in Chapter VII, 
when analysing a number of parameters associated to simple varieties of trees. 
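The recipe of Theorem VI.6 (solve the characteristic equation, form $\rho$, read off the estimate) can be sketched numerically. The code below is an illustration under stated assumptions: the bisection bounds are chosen by hand for each $\phi$, the function names are hypothetical, and the exact counts used for comparison come from Lagrange inversion, $[z^n]y(z) = \frac{1}{n}[u^{n-1}]\phi(u)^n$.

```python
import math

def characteristic_root(phi, dphi, upper):
    """Solve phi(t) - t*phi'(t) = 0 on (0, upper) by bisection.
    Assumes the left-hand side is positive near 0 and negative at `upper`."""
    lo, hi = 0.0, upper
    for _ in range(200):
        mid = (lo + hi) / 2
        if phi(mid) - mid * dphi(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def log_estimate(phi, dphi, d2phi, upper, n):
    """Logarithm of sqrt(phi(tau)/(2 phi''(tau))) * rho^(-n) / sqrt(pi n^3)."""
    tau = characteristic_root(phi, dphi, upper)
    rho = tau / phi(tau)
    const = math.sqrt(phi(tau) / (2 * d2phi(tau)))
    return math.log(const) - n * math.log(rho) - 0.5 * math.log(math.pi * n ** 3)

n = 1000
# Each entry: phi, phi', phi'', bisection upper bound, log of the exact coefficient
cases = {
    "binary": (lambda u: (1 + u) ** 2, lambda u: 2 * (1 + u), lambda u: 2.0, 5.0,
               math.log(math.comb(2 * n, n - 1)) - math.log(n)),
    "general": (lambda u: 1 / (1 - u), lambda u: 1 / (1 - u) ** 2,
                lambda u: 2 / (1 - u) ** 3, 0.999,
                math.log(math.comb(2 * n - 2, n - 1)) - math.log(n)),
    "Cayley": (math.exp, math.exp, math.exp, 5.0,
               (n - 1) * math.log(n) - math.lgamma(n + 1)),
}
ratios = {name: math.exp(exact - log_estimate(phi, dphi, d2phi, upper, n))
          for name, (phi, dphi, d2phi, upper, exact) in cases.items()}
print(ratios)
```

Working with logarithms avoids overflow for large $n$; each ratio of exact to estimated coefficient differs from 1 by a term of order $1/n$.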

▷ VI.14. All kinds of graphs. In relation with the classes of graphs listed in Figure II.14, p. 134, one has the following correspondence between an EGF $f(z)$ and the asymptotic form of $n![z^n]f(z)$:

function: | $e^{T - T^2/2}$ | $\log\dfrac{1}{1-T}$ | $\dfrac{1}{\sqrt{1-T}}$ | $\dfrac{1}{(1-T)^m}$
coefficient: | $e^{1/2}n^{n-2}$ | $\frac{1}{2}\sqrt{2\pi}\,n^{n-1/2}$ | $C_1 n^{n-1/4}$ | $C_2 n^{n+(m-1)/2}$

($m \in \mathbb{Z}_{\ge 1}$; $C_1$, $C_2$ represent computable constants). In this way, the estimates of Subsection II.5.3, p. 132, are justifiable by singularity analysis. ◁


▷ VI.15. Computability of singular expansions. Define

$$h(w) := \sqrt{\frac{\tau/\phi(\tau) - w/\phi(w)}{(\tau - w)^2}},$$

so that $y(z)$ satisfies $\sqrt{\rho - z} = (\tau - y)h(y)$. The singular expansion of $y$ can then be deduced by Lagrange inversion from the expansion of the negative powers of $h(w)$ at $w = \tau$. This technique yields for instance explicit forms for coefficients in the singular expansion of $y = ze^{y}$. ◁


▷ VI.16. Stirling's formula via singularity analysis. The solution to $T = ze^{T}$ analytic at 0 is the Cayley tree function. It satisfies $[z^n]T(z) = n^{n-1}/n!$ (by Lagrange inversion) and, at the same time, its singularity is known from Theorem VI.6 and Example VI.8. As a consequence:

$$\frac{n^{n-1}}{n!} \sim \frac{e^n}{\sqrt{2\pi n^3}}\left(1 - \frac{1}{12n} + \frac{1}{288n^2} + \frac{139}{51840\,n^3} - \cdots\right).$$

Thus Stirling's formula also results from singularity analysis. ◁
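The expansion quoted in Note VI.16 is easy to confirm numerically. The following sketch (function names are illustrative assumptions) evaluates the ratio of $n^{n-1}/n!$ to $e^n/\sqrt{2\pi n^3}$ through logarithms, using the standard library's `math.lgamma`, and compares it with the four displayed terms.

```python
import math

def cayley_ratio(n):
    """(n^(n-1)/n!) divided by e^n / sqrt(2*pi*n^3), via logarithms to avoid overflow."""
    log_exact = (n - 1) * math.log(n) - math.lgamma(n + 1)
    log_main = n - 0.5 * math.log(2 * math.pi * n ** 3)
    return math.exp(log_exact - log_main)

def series(n):
    """The four terms of the expansion displayed in Note VI.16."""
    return 1 - 1 / (12 * n) + 1 / (288 * n ** 2) + 139 / (51840 * n ** 3)

n = 50
gap = cayley_ratio(n) - series(n)   # expected to be of order n^(-4)
print(cayley_ratio(n), series(n), gap)
```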


▷ VI.17. Periodicities. Assume that $\phi(u) = \psi(u^p)$ with $\psi$ analytic at 0 and $p \ge 2$. Let $y = y(z)$ be the root of $y = z\phi(y)$. Set $Z = z^p$ and let $Y(Z)$ be the root of $Y = Z\psi(Y)^p$. One has by construction $y(z) = Y(z^p)^{1/p}$, given that $y^p = z^p\phi(y)^p$. Since $Y(Z) = Y_1 Z + Y_2 Z^2 + \cdots$, we verify that the non-zero coefficients of $y(z)$ are among those of index $1, 1+p, 1+2p, \ldots$.

If $p$ is chosen maximal, then $\psi(u)^p$ is aperiodic. Then Theorem VI.6 applies to $Y(Z)$: the function $Y(Z)$ is analytically continuable beyond its dominant singularity at $Z = \rho^p$; it has a square root singularity at $\rho^p$ and no other singularity on $|Z| = \rho^p$. Furthermore, since $Y = Z\psi(Y)^p$, the function $Y(Z)$ cannot vanish on $|Z| \le \rho^p$, $Z \neq 0$. Thus, $Y(Z)^{1/p}$ is analytic in $|Z| \le \rho^p$, except at $\rho^p$ where it has a $\sqrt{\ }$ branch point. All computations done, we find that

$$[z^n]y(z) \sim p\cdot\frac{d_1\,\rho^{-n}}{2\sqrt{\pi n^3}} \qquad\text{when } n \equiv 1 \pmod{p}. \tag{40}$$

The argument also shows that $y(z)$ has $p$ conjugate roots on its circle of convergence. (This is a kind of Perron–Frobenius property for periodic tree functions.)
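Equation (40) can be illustrated on the periodic case $\phi(u) = 1 + u^2$ (period $p = 2$), for which $\tau = 1$, $\rho = 1/2$ and $d_1 = \sqrt{2}$. The sketch below is a minimal check under these values; exact coefficients are obtained from Lagrange inversion, and the function names are hypothetical.

```python
import math

def binary_tree_coeff(n):
    """[z^n] y(z) for y = z(1 + y^2), by Lagrange inversion: (1/n)[u^(n-1)](1+u^2)^n."""
    if n % 2 == 0:
        return 0                       # u^(n-1) has odd degree, absent from (1+u^2)^n
    return math.comb(n, (n - 1) // 2) // n

def periodic_estimate(n, p=2, d1=math.sqrt(2), rho=0.5):
    """Right-hand side of (40): p * d1 * rho^(-n) / (2 sqrt(pi n^3))."""
    return p * d1 * rho ** (-n) / (2 * math.sqrt(math.pi * n ** 3))

n = 1001                               # n = 1 (mod 2)
ratio = binary_tree_coeff(n) / periodic_estimate(n)
print([binary_tree_coeff(k) for k in range(1, 10)], ratio)
```

The coefficients vanish at even indices and reproduce the Catalan numbers at odd ones; the ratio to the estimate (40) is close to 1 for large odd $n$.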


▷ VI.18. Boundary cases I. The case when $\tau$ lies on the boundary of the disc of convergence of $\phi$ may lead to asymptotic estimates differing from the usual $\rho^{-n}n^{-3/2}$ prototype. Without loss of generality, take $\phi$ aperiodic to have radius of convergence equal to 1 and assume that $\phi$ is of the form

$$\phi(u) = u + c(1-u)^{\alpha} + o\left((1-u)^{\alpha}\right), \qquad\text{with } 1 < \alpha \le 2, \tag{41}$$

as $u$ tends to 1 within $|u| < 1$. (Thus, continuation of $\phi(u)$ beyond $|u| < 1$ is not assumed.) The solution of the characteristic equation $\phi(\tau) - \tau\phi'(\tau) = 0$ is then $\tau = 1$. The function $y(z)$ defined by $y = z\phi(y)$ is $\Delta$-analytic (by a mapping argument similar to the one exemplified by Figure VI.9 and related to the fact that $\phi$ "multiplies" angles near 1). The singular expansion of $y(z)$ and the coefficients then satisfy

$$y(z) = 1 - c^{-1/\alpha}(1-z)^{1/\alpha} + o\left((1-z)^{1/\alpha}\right) \implies y_n \sim c^{-1/\alpha}\,\frac{n^{-1/\alpha-1}}{-\Gamma(-1/\alpha)}. \tag{42}$$

[The case $\alpha = 2$ was first observed by Janson [350]. Trees with $\alpha \in (1,2)$ have been investigated in connection with stable Lévy processes [180]. The singular exponent $\alpha = 3/2$ occurs for instance in planar maps (Subsection VII.8.2, p. 513), so that GFs with coefficients of the form $\rho^{-n}n^{-5/3}$ would arise, if considering trees whose nodes are themselves maps.] ◁


▷ VI.19. Boundary cases II. Let $\phi(u)$ be the probability generating function of a random variable $X$ with mean equal to 1 and such that $\phi_n \sim \lambda n^{-\alpha-1}$ with $1 < \alpha < 2$. Then, by a complex version of an Abelian theorem (see, e.g., [69, §1.7] and [232]), the singular expansion (41) holds when $u \to 1$, $|u| < 1$, within a cone, so that the conclusions of (42) hold in that case. Similarly, if $\phi''(1)$ exists, meaning that $X$ has a second moment, then the estimate (42) holds with $\alpha = 2$, and then coincides with what Theorem VI.6 predicts [350]. (In probabilistic terms, the condition of Theorem VI.6 is equivalent to postulating the existence of exponential moments for the one-generation offspring distribution.) ◁
VI.8. Polylogarithms


Generating functions involving sequences such as $(\sqrt{n})$ or $(\log n)$ can be subjected to singularity analysis. The starting point is the definition of the generalized polylogarithm, commonly denoted⁸ by $\mathrm{Li}_{\alpha,r}$, where $\alpha$ is an arbitrary complex number and $r$ a non-negative integer:

$$\mathrm{Li}_{\alpha,r}(z) := \sum_{n \ge 1} (\log n)^r\,\frac{z^n}{n^{\alpha}}.$$

The series converges for $|z| < 1$, so that the function $\mathrm{Li}_{\alpha,r}$ is a priori analytic in the unit disc. The quantity $\mathrm{Li}_{1,0}(z)$ is the usual logarithm, $\log(1-z)^{-1}$, hence the established name, polylogarithm, assigned to these functions [406]. In what follows, we make use of the abbreviation $\mathrm{Li}_{\alpha,0}(z) \equiv \mathrm{Li}_{\alpha}(z)$, so that $\mathrm{Li}_1(z) \equiv \mathrm{Li}_{1,0}(z) \equiv \log(1-z)^{-1}$ is the GF of the sequence $(1/n)$. Similarly, $\mathrm{Li}_{0,1}$ is the GF of the sequence $(\log n)$ and $\mathrm{Li}_{-1/2}(z)$ is the GF of the sequence $(\sqrt{n})$.

Polylogarithms are continuable to the whole of the complex plane slit along the ray $\mathbb{R}_{\ge 1}$, a fact established early in the twentieth century by Ford [268], which results from the integral representation (48), p. 409. They are amenable to singularity analysis [223] and their singular expansions involve the Riemann zeta function defined by

$$\zeta(s) = \sum_{n=1}^{\infty}\frac{1}{n^{s}},$$

for $\Re(s) > 1$, and by analytic continuation elsewhere [578].
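Theorem VI.7 (Singularities of polylogarithms). For all $\alpha \in \mathbb{C}$ and $r \in \mathbb{Z}_{\ge 0}$, the function $\mathrm{Li}_{\alpha,r}(z)$ is analytic in the slit plane $\mathbb{C} \setminus \mathbb{R}_{\ge 1}$. For $\alpha \notin \{1, 2, \ldots\}$, there exists an infinite singular expansion (with logarithmic terms when $r > 0$) given by the two rules:

$$\begin{cases} \mathrm{Li}_{\alpha}(z) \sim \Gamma(1-\alpha)\,w^{\alpha-1} + \displaystyle\sum_{j \ge 0}\frac{(-1)^j}{j!}\,\zeta(\alpha-j)\,w^{j}, \qquad w := \displaystyle\sum_{\ell \ge 1}\frac{(1-z)^{\ell}}{\ell} \\[3mm] \mathrm{Li}_{\alpha,r}(z) = (-1)^r\,\dfrac{\partial^r}{\partial\alpha^r}\,\mathrm{Li}_{\alpha}(z), \qquad r \ge 0. \end{cases} \tag{43}$$

The expansion of $\mathrm{Li}_{\alpha}$ is conveniently described by the composition of two expansions (Figure VI.11, p. 410): the expansion of $w = -\log z$ at $z = 1$, namely, $w = (1-z) + \frac{1}{2}(1-z)^2 + \cdots$, is to be substituted inside the formal power series involving powers of $w$. The exponents of $(1-z)$ involved in the resulting expansion are $\{\alpha-1, \alpha, \ldots\} \cup \{0, 1, \ldots\}$. For $\alpha < 1$, the main asymptotic term of $\mathrm{Li}_{\alpha,r}$ is, as $z \to 1$,

$$\mathrm{Li}_{\alpha,r}(z) \sim \Gamma(1-\alpha)\,(1-z)^{\alpha-1}L(z)^{r}, \qquad L(z) := \log\frac{1}{1-z},$$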


⁸The notation $\mathrm{Li}_{\alpha}(z)$ is nowadays well established. It is evocative of the fact that polylogarithms of integer order $m \ge 2$ are expressible by a logarithmic integral:

$$\mathrm{Li}_{m,0}(x) = \frac{(-1)^{m-1}}{(m-2)!}\int_0^1 \log(1-xt)\,\log^{m-2}t\;\frac{dt}{t}$$

(not to be confused with the unrelated "logarithmic integral function" $\mathrm{li}(z) := \int_0^z \frac{dt}{\log t}$; see [3, p. 228]).
while, for $\alpha > 1$, we have $\mathrm{Li}_{\alpha,r}(z) \sim (-1)^r\zeta^{(r)}(\alpha)$, since the sum defining $\mathrm{Li}_{\alpha,r}$ converges at 1.
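The first rule of (43) can be sanity-checked in two cases where the polylogarithm is elementary: $\mathrm{Li}_0(z) = z/(1-z)$ and $\mathrm{Li}_{-1}(z) = z/(1-z)^2$. The sketch below is an illustration under stated assumptions: it hard-codes the classical values of $\zeta$ at non-positive integers, $\zeta(-k) = -B_{k+1}/(k+1)$, rather than computing them, and the function name `li_singular` is hypothetical.

```python
import math

# Classical values of the Riemann zeta function at non-positive integers,
# zeta(-k) = -B_{k+1}/(k+1); hard-coded here rather than computed.
ZETA = {0: -1 / 2, -1: -1 / 12, -2: 0.0, -3: 1 / 120, -4: 0.0, -5: -1 / 252}

def li_singular(alpha, z, terms):
    """Truncated expansion (43): Gamma(1-a) w^(a-1) + sum_j (-1)^j zeta(a-j) w^j / j!."""
    w = -math.log(z)
    total = math.gamma(1 - alpha) * w ** (alpha - 1)
    for j in range(terms):
        total += (-1) ** j / math.factorial(j) * ZETA[alpha - j] * w ** j
    return total

z = 0.9
li0_exact = z / (1 - z)            # Li_0(z)    = sum_{n>=1} z^n
li_m1_exact = z / (1 - z) ** 2     # Li_{-1}(z) = sum_{n>=1} n z^n
li0_series = li_singular(0, z, terms=6)
li_m1_series = li_singular(-1, z, terms=5)
print(li0_exact, li0_series, li_m1_exact, li_m1_series)
```

For $\alpha = 0$ the expansion reduces to the Bernoulli-number expansion of $1/(e^{w}-1)$, which explains the close agreement already with a handful of terms.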


Proof. The analysis crucially relies on the Mellin transform (see Appendix B.7: Mellin transforms, p. 762). We start with the case $r = 0$ and consider several ways in which $z$ may approach the singularity 1. Step (i) below describes the main ingredient needed in obtaining the expansion, the subsequent steps being only required for justifying it in larger regions of the complex plane.


(i) When $z \to 1^{-}$ along the real line. Set $w = -\log z$ and introduce

$$\Lambda(w) := \mathrm{Li}_{\alpha}(e^{-w}) = \sum_{n \ge 1}\frac{e^{-nw}}{n^{\alpha}}. \tag{44}$$

This is a harmonic sum in the sense of Mellin transform theory, so that the Mellin transform of $\Lambda$ satisfies ($\Re(s) > \max(0, 1-\alpha)$)

$$\Lambda^{\star}(s) \equiv \int_0^{\infty}\Lambda(w)\,w^{s-1}\,dw = \zeta(s+\alpha)\,\Gamma(s). \tag{45}$$

The function $\Lambda(w)$ can be recovered from the inverse Mellin integral,

$$\Lambda(w) = \frac{1}{2i\pi}\int_{c-i\infty}^{c+i\infty}\zeta(s+\alpha)\,\Gamma(s)\,w^{-s}\,ds, \tag{46}$$

with $c$ taken in the half-plane in which $\Lambda^{\star}(s)$ is defined. There are poles at $s = 0, -1, -2, \ldots$ due to the Gamma factor and a pole at $s = 1-\alpha$ due to the zeta function. Take $d$ to be of the form $-m - \frac{1}{2}$ and smaller than $1-\alpha$. Then, a standard residue calculation, taking into account poles to the left of $c$ and based on

$$\Lambda(w) = \sum_{s_0 \in \{0,-1,\ldots,-m\}\cup\{1-\alpha\}}\operatorname{Res}\left(\zeta(s+\alpha)\,\Gamma(s)\,w^{-s}\right)_{s=s_0} + \frac{1}{2i\pi}\int_{d-i\infty}^{d+i\infty}\zeta(s+\alpha)\,\Gamma(s)\,w^{-s}\,ds, \tag{47}$$

then yields a finite form of the estimate (43) of $\mathrm{Li}_{\alpha}$ (as $w \to 0$, corresponding to $z \to 1$).


(ii) When $z \to 1^{-}$ in a cone of angle less than $\pi$ inside the unit disc. In that case, we observe that the identity in (46) remains valid by analytic continuation, since the integral there is still convergent (this property owes to the fast decay of $\Gamma(s)$ towards $\pm i\infty$). Then the residue calculation (47), on which the expansion of $\Lambda(w)$ is based in the real case $w > 0$, still makes sense. The extension of the asymptotic expansion of $\mathrm{Li}_{\alpha}$ within the unit disc is thus granted.


(iii) When $z$ tends to 1 vertically. Details of the proof are given in [223]. What is needed is a justification of the validity of expansion (43), when $z$ is allowed to tend to 1 from the exterior of the unit disc. The key to the analysis is a Lindelöf integral representation of the polylogarithm (Notes IV.8 and IV.9, p. 237), which provides analytic continuation; namely,

$$\mathrm{Li}_{\alpha}(-z) = -\frac{1}{2i\pi}\int_{1/2-i\infty}^{1/2+i\infty}\frac{z^{s}}{s^{\alpha}}\,\frac{\pi}{\sin\pi s}\,ds. \tag{48}$$
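$$\mathrm{Li}_{-1/2}(z) \equiv \sum_{n \ge 1}\sqrt{n}\,z^n = \frac{\sqrt{\pi}}{2(1-z)^{3/2}} - \frac{3\sqrt{\pi}}{8(1-z)^{1/2}} + \zeta(-\tfrac{1}{2}) + O\left((1-z)^{1/2}\right)$$

$$\mathrm{Li}_{0}(z) \equiv \sum_{n \ge 1} z^n = \frac{1}{1-z} - 1$$

$$\mathrm{Li}_{0,1}(z) \equiv \sum_{n \ge 1}\log n\;z^n = \frac{L(z) - \gamma}{1-z} - \frac{1}{2}L(z) + \frac{\gamma - 1}{2} + \log\sqrt{2\pi} + O\left((1-z)L(z)\right)$$

$$\mathrm{Li}_{1/2}(z) \equiv \sum_{n \ge 1}\frac{z^n}{\sqrt{n}} = \sqrt{\frac{\pi}{1-z}} + \zeta(\tfrac{1}{2}) + O\left((1-z)^{1/2}\right)$$

$$\mathrm{Li}_{1/2,1}(z) \equiv \sum_{n \ge 1}\frac{\log n}{\sqrt{n}}\,z^n = \sqrt{\frac{\pi}{1-z}}\left(L(z) - \gamma - 2\log 2\right) - \zeta(\tfrac{1}{2})\left(\frac{\gamma}{2} + \frac{\pi}{4} + \log\sqrt{8\pi}\right) + \cdots$$

$$\mathrm{Li}_{1}(z) \equiv \sum_{n \ge 1}\frac{z^n}{n} \equiv L(z)$$

$$\mathrm{Li}_{2}(z) \equiv \sum_{n \ge 1}\frac{z^n}{n^2} = \frac{\pi^2}{6} - (L(z) + 1)(1-z) - \left(\frac{1}{4} + \frac{1}{2}L(z)\right)(1-z)^2 + \cdots$$

Figure VI.11. Sample expansions of polylogarithms ($L(z) := \log(1-z)^{-1}$).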


The proof then proceeds with the analysis of the polylogarithm when $z = e^{i(w-\pi)}$ and $s = 1/2 + it$, the integral (48) being estimated asymptotically as a harmonic integral (a continuous analogue of harmonic sums [614]) by means of Mellin transforms. The extension to a cone with vertex at 1, having a vertical symmetry and angle less than $\pi$, then follows by an analytic continuation argument. By unicity of asymptotic expansions (the horizontal cone of parts (i) and (ii) and the vertical cone have a non-empty intersection), the resulting expansion must coincide with the one calculated explicitly in part (i), above.

To conclude, regarding the general case $r > 0$, we may proceed along similar lines, with each $\log n$ factor introducing a derivative of the Riemann zeta function, hence a multiple pole at $s = 1$. It can then be checked that the resulting expansion coincides with what is given by formally differentiating the expansion of $\mathrm{Li}_{\alpha}$ a number of times equal to $r$. (See also Note VI.20 below.)


Figure VI.11 provides a table of expansions relative to commonly encountered polylogarithms (the function $\mathrm{Li}_2$ is also known as a dilogarithm). Example VI.9 illustrates the use of polylogarithms for establishing a class of asymptotic expansions of which Stirling's formula appears as a special case. Further uses of Theorem VI.7 will appear in the following sections.


Example VI.9. Stirling's formula, polylogarithms, and superfactorials. One has

$$\sum_{n \ge 1}\log n!\;z^n = (1-z)^{-1}\,\mathrm{Li}_{0,1}(z),$$

to which singularity analysis is applicable. Theorem VI.7 then yields the singular expansion

$$\frac{1}{1-z}\,\mathrm{Li}_{0,1}(z) \sim \frac{L(z) - \gamma}{(1-z)^2} + \frac{1}{2}\,\frac{-L(z) + \gamma - 1 + \log 2\pi}{1-z} + \cdots,$$

from which Stirling's formula reads off:

$$\log n! \sim n\log n - n + \frac{1}{2}\log n + \log\sqrt{2\pi} + \cdots.$$

(Stirling's constant $\log\sqrt{2\pi}$ comes out as neatly $-\zeta'(0)$.) Similarly, define the superfactorial function to be $1^1 2^2 \cdots n^n$. One has

$$\sum_{n \ge 1}\log\left(1^1 2^2 \cdots n^n\right)z^n = \frac{1}{1-z}\,\mathrm{Li}_{-1,1}(z),$$

to which singularity analysis is mechanically applicable. The analogue of Stirling's formula then reads:

$$1^1 2^2 \cdots n^n \sim A\,n^{\frac{1}{2}n^2 + \frac{1}{2}n + \frac{1}{12}}\,e^{-\frac{1}{4}n^2}, \qquad A := \exp\left(\frac{1}{12} - \zeta'(-1)\right) = \exp\left(-\frac{\zeta'(2)}{2\pi^2} + \frac{\log(2\pi) + \gamma}{12}\right).$$

The constant $A$ is known as the Glaisher–Kinkelin constant [211, p. 135]. Higher order factorials can be treated similarly.
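The analogue of Stirling's formula for $1^1 2^2 \cdots n^n$ can be tested by solving it for the constant. The sketch below is a minimal check, assuming the standard growth form $A\,n^{n^2/2 + n/2 + 1/12}e^{-n^2/4}$ and the known numerical value $A \approx 1.2824271291$ of the Glaisher–Kinkelin constant; the function name is illustrative.

```python
import math

GLAISHER = 1.2824271291  # Glaisher-Kinkelin constant, quoted to 10 decimals

def glaisher_estimate(n):
    """Solve 1^1 2^2 ... n^n ~ A n^(n^2/2 + n/2 + 1/12) e^(-n^2/4) for A at finite n."""
    # fsum keeps the accumulated rounding error of the long sum small
    log_super = math.fsum(k * math.log(k) for k in range(1, n + 1))
    exponent = n * n / 2 + n / 2 + 1 / 12
    return math.exp(log_super - exponent * math.log(n) + n * n / 4)

estimate = glaisher_estimate(200)
print(estimate)
```

Already at $n = 200$ the estimate agrees with the constant to about seven decimal places, since the next correction is of order $n^{-2}$.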


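▷ VI.20. Polylogarithms of integral index and a general formula. Let $\alpha = m \in \mathbb{Z}_{\ge 1}$. Then:

$$\mathrm{Li}_m(z) = \frac{(-1)^m}{(m-1)!}\,w^{m-1}\left(\log w - H_{m-1}\right) + \sum_{j \ge 0,\ j \neq m-1}\frac{(-1)^j}{j!}\,\zeta(m-j)\,w^{j},$$

where $H_m$ is the harmonic number and $w = -\log z$. [The line of proof is the same as in Theorem VI.7, only the residue calculation at $s = 1$ differs.] The general formula,

$$\mathrm{Li}_{\alpha,r}(z) \sim (-1)^r\,\frac{\partial^r}{\partial\alpha^r}\sum_{s \in \mathbb{Z}_{\le 0}\cup\{1-\alpha\}}\operatorname{Res}\left[\zeta(s+\alpha)\,\Gamma(s)\,w^{-s}\right], \qquad w := -\log z,$$

holds for all $\alpha \in \mathbb{C}$ and $r \in \mathbb{Z}_{\ge 0}$ and is amenable to symbolic manipulation. ◁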


VI.9. Functional composition 


Let f and g be functions analytic at the origin that have non-negative coefficients. 
We consider the composition 


h=fog, h(z) = f(g(z)), 


assuming $g(0) = 0$. Let $\rho_f, \rho_g, \rho_h$ be the corresponding radii of convergence, and let $\tau_f := f(\rho_f)$, and so on. We shall assume that $f$ and $g$ are $\Delta$-continuable and that they admit singular expansions in the scale of powers. There are three cases to be distinguished depending on the value of $\tau_g$ in comparison with $\rho_f$.


— Supercritical case, when $\tau_g > \rho_f$. In that case, when $z$ increases from 0, there is a value $r$ strictly less than $\rho_g$ such that $g(r)$ attains the value $\rho_f$, which triggers a singularity of $f \circ g$. In other words $r \equiv \rho_h = g^{(-1)}(\rho_f)$. Around this point, $g$ is analytic and a singular expansion of $f \circ g$ is obtained by combining the singular expansion of $f$ with the regular expansion of $g$ at $r$. The singularity type is that of the external function ($f$).



— Subcritical case, when $\tau_g < \rho_f$. In this dual situation, the singularity of $f \circ g$ is driven by that of the inside function $g$. We have $\rho_h = \rho_g$, $\tau_h = f(\tau_g)$ and the singular expansion of $f \circ g$ is obtained by combining the regular expansion of $f$ with the singular expansion of $g$ at $\rho_g$. The singularity type is that of the internal function ($g$).

— Critical case, when $\tau_g = \rho_f$. In this boundary case, there is a confluence of singularities. We have $\rho_h = \rho_g$, $\tau_h = \tau_f$, and the singular expansion is obtained by applying the composition rules of the singular expansions involved. The singularity type is a mix of the types of the internal and external functions ($f$, $g$).


This classification extends the notion of a supercritical sequence schema in Section V.2, p. 293, for which the external function reduces to $f(z) = (1-z)^{-1}$, with $\rho_f = 1$. In this chapter, we limit ourselves to discussing examples directly, based on the guidelines above supplemented by the plain algebra of generalized power series expansions. Finer probabilistic properties of composition schemas are studied at several places in Chapter IX starting on p. 629.


Example VI.10. "Supertrees". Let $\mathcal{G}$ be the class of general Catalan trees:

$$\mathcal{G} = \mathcal{Z} \times \mathrm{SEQ}(\mathcal{G}) \implies G(z) = \frac{1}{2}\left(1 - \sqrt{1-4z}\right).$$

The radius of convergence of $G(z)$ is $1/4$ and the singular value is $G(1/4) = 1/2$. The class $\mathcal{Z}\mathcal{G}$ consists of planted trees, which are such that to the root is attached a stem and an extra node, with OGF equal to $zG(z)$. We then introduce two classes of supertrees defined by substitution:

$$\mathcal{H} = \mathcal{G}[\mathcal{Z}\mathcal{G}] \implies H(z) = G(zG(z)),$$
$$\mathcal{K} = \mathcal{G}[(\mathcal{Z} + \mathcal{Z}')\mathcal{G}] \implies K(z) = G(2zG(z)).$$

These are "trees of trees": the class $\mathcal{H}$ is formed of trees such that, on each node there is grafted a planted tree (by the combinatorial substitution of Section I.6, p. 83); the class $\mathcal{K}$ similarly corresponds to the case when the stems can be of any two colours. Incidentally, combinatorial sum expressions are available for the coefficients,

$$H_n = \sum_{k=1}^{\lfloor n/2\rfloor}\frac{1}{n-k}\binom{2k-2}{k-1}\binom{2n-3k-1}{n-k-1}, \qquad K_n = \sum_{k=1}^{\lfloor n/2\rfloor}\frac{2^k}{n-k}\binom{2k-2}{k-1}\binom{2n-3k-1}{n-k-1},$$

the initial values being given by

$$H(z) = z^2 + z^3 + 3z^4 + 7z^5 + 21z^6 + \cdots, \qquad K(z) = 2z^2 + 2z^3 + 8z^4 + 18z^5 + 64z^6 + \cdots.$$

Since $\rho_G = 1/4$ and $\tau_G = 1/2$, the composition scheme is subcritical in the case of $\mathcal{H}$ and critical in the case of $\mathcal{K}$. In the first case, the singularity is of square-root type and one finds easily:

$$H(z) \underset{z \to 1/4}{\sim} \frac{2-\sqrt{2}}{4} - \frac{1}{\sqrt{8}}\sqrt{\frac{1}{4} - z}, \qquad H_n \sim \frac{4^n}{8\sqrt{2\pi}\,n^{3/2}}.$$

In the second case, the two square-roots combine to produce a fourth root:

$$K(z) \underset{z \to 1/4}{\sim} \frac{1}{2} - \frac{1}{\sqrt{2}}\left(\frac{1}{4} - z\right)^{1/4}, \qquad K_n \sim \frac{4^n}{8\,\Gamma(\frac{3}{4})\,n^{5/4}}.$$
Figure VI.12. A binary supertree is a "tree of trees", with component trees all binary. The number of binary supertrees with $2n$ nodes has the unusual asymptotic form $c\,4^n n^{-5/4}$.


On a similar register, consider the class $\mathcal{B}$ of complete binary trees:

$$\mathcal{B} = \mathcal{Z} + \mathcal{Z} \times \mathcal{B} \times \mathcal{B} \implies B(z) = \frac{1 - \sqrt{1-4z^2}}{2z},$$

and define the class of binary supertrees (Figure VI.12) by

$$\mathcal{S} = \mathcal{B}(\mathcal{Z} \times \mathcal{B}) \implies S(z) = \frac{1 - \sqrt{2\sqrt{1-4z^2} - 1 + 4z^2}}{1 - \sqrt{1-4z^2}}.$$

The composition is critical since $zB(z) = \frac{1}{2}$ at the dominant singularity $z = \frac{1}{2}$. It is enough to consider the reduced function

$$\overline{S}(z) = S(\sqrt{z}) = z + z^2 + 3z^3 + 8z^4 + 25z^5 + 80z^6 + 267z^7 + 911z^8 + \cdots,$$

whose coefficients constitute EIS A101490 and occur in Bousquet-Mélou's study of integrated superbrownian excursion [83]. We find

$$\overline{S}(z) \sim 1 - \sqrt{2}\,(1-4z)^{1/4} + (1-4z)^{1/2} + \cdots \implies \overline{S}_n = \frac{4^n}{n^{5/4}}\left(\frac{\sqrt{2}}{4\,\Gamma(\frac{3}{4})} - \frac{1}{2\sqrt{\pi}\,n^{1/4}} + \cdots\right).$$

For instance, a seven-term expansion yields a relative accuracy better than $10^{-4}$ for $n \ge 100$, so that such approximations are quite usable in practice.

The occurrence of the exponent — 5 in the enumeration of bicoloured and binary supertrees 
is noteworthy. Related constructions have been considered by Kemp [364] who obtained more 
generally exponents of the form —1— 2-4 by iterating the substitution construction (in connec- 
tion with so-called “multidimensional trees’’). It is significant that asymptotic terms of the form 
nP/4 with q # 1,2 appear in elementary combinatorics, even in the context of simple algebraic 
functions. Such exponents tend to be associated with non-standard limit laws, akin to the stable 
distributions of probability theory: see our discussion in Section IX. 12, p. 715. ........... ‘| 
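The slow convergence caused by the $n^{-1/4}$ scale can be observed numerically. The sketch below (function names are hypothetical) computes the reduced coefficients exactly from S = B(Z × B), using the assumption, read off from the OGF of B, that zB(z) = Σ Cat(n−1) z^{2n} and B(w) = Σ Cat(m) w^{2m+1}, and compares them with the first two terms of the asymptotic estimate.

```python
from math import comb, gamma, pi, sqrt


def _mul(a, b, n_max):
    """Truncated product of two coefficient lists of length n_max + 1."""
    out = [0] * (n_max + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j in range(n_max + 1 - i):
            out[i + j] += ai * b[j]
    return out


def reduced_supertree_coeffs(n_max):
    """Coefficients of S(sqrt(z)) up to z^n_max, by direct series composition."""
    cat = [comb(2 * m, m) // (m + 1) for m in range(n_max + 1)]
    c = [0] + cat[:n_max]          # zB(z) expressed in the variable Z = z^2
    c_squared = _mul(c, c, n_max)
    result = [0] * (n_max + 1)
    power = c[:]                   # holds c^(2m+1)
    m = 0
    while 2 * m + 1 <= n_max:
        for n in range(n_max + 1):
            result[n] += cat[m] * power[n]
        power = _mul(power, c_squared, n_max)
        m += 1
    return result


def two_term_estimate(n):
    """First two terms of the asymptotic form: orders n^(-5/4) and n^(-3/2)."""
    return 4.0 ** n / n ** 1.25 * (sqrt(2) / (4 * gamma(0.75))
                                   - 1 / (2 * sqrt(pi) * n ** 0.25))


if __name__ == "__main__":
    coeffs = reduced_supertree_coeffs(120)
    print(coeffs[1:9])
    for n in (30, 60, 120):
        print(n, coeffs[n] / two_term_estimate(n))
```

The printed ratios approach 1 only at the rate $n^{-1/2}$, which illustrates why several terms of the expansion are needed for accurate numerical estimates.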


▷ VI.21. Supersupertrees. Define supersupertrees by

$$S^{[2]}(z) = B(zB(zB(z))).$$

We find automatically (with the help of B. Salvy’s program)

$$[z^{2n+1}]\,S^{[2]}(z) \sim 2^{-13/4}\,\Gamma\!\left(\tfrac78\right)^{-1} 4^{n+1}\, n^{-9/8},$$

and further extensions involving an asymptotic term $n^{-1-2^{-d}}$ are possible [364]. ◁


▷ VI.22. Valuated trees. Consider the family of (rooted) general plane trees, whose vertices are decorated by integers from $\mathbb{Z}_{\ge 0}$ (called “values”) and such that the values of two adjacent vertices differ by ±1. Size is taken to be the number of edges. Let $T_j$ be the class of valuated trees whose root has value j and $T = \cup T_j$. The OGFs $T_j(z)$ satisfy the system of equations

$$T_j = 1 + z\,(T_{j-1} + T_{j+1})\,T_j,$$

so that T(z) solves $T = 1 + 2zT^2$ and is a simple variant of the Catalan OGF:

$$T(z) = \frac{1-\sqrt{1-8z}}{4z}.$$

Bouttier, Di Francesco, and Guitter [90, 91] found an amazing explicit form for the $T_j$; namely,

$$T_j = T\,\frac{(1-Y^{j+1})(1-Y^{j+5})}{(1-Y^{j+2})(1-Y^{j+4})}, \qquad\text{where}\quad Y = z\,\frac{(1+Y)^4}{1+Y^2}.$$

In particular, each $T_j$ is an algebraic function. The function $T_0$ counts maps (p. 513) that are Eulerian triangulations, or dually bipartite trivalent maps. The coefficients of the $T_j$ as well as the distributions of labels in such trees can be analysed asymptotically: see Bousquet-Mélou’s article [83] for a rich set of combinatorial connections.


Schemas. Singularity analysis also enables us to discuss at a fair level of general- 
ity the behaviour of schemas, in a way that parallels the discussion of the supercritical 
sequence schema, based on a meromorphic analysis (Section V.2, p. 293). We illus- 
trate this point here by means of the supercritical cycle schema. Deeper examples 
relative to recursively defined structures are developed in Chapter VII. 


Example VI.11. Supercritical cycle schema. The schema H = Cyc(G) forms labelled cycles from basic components in G:

$$\mathcal{H} = \operatorname{Cyc}(\mathcal{G}) \quad\Longrightarrow\quad H(z) = \log\frac{1}{1-G(z)}.$$

Consider the case where G attains the value 1 before becoming singular, that is, τ_G > 1. This corresponds to a supercritical composition schema, which can be discussed in a way that closely parallels the supercritical sequence schema (Section V.2, p. 293): a logarithmic singularity replaces a polar singularity.

Let σ := ρ_H, which is determined by G(σ) = 1. First, one finds:

$$H(z) \mathop{\sim}_{z\to\sigma} \log\frac{1}{1-z/\sigma} - \log(\sigma G'(\sigma)) + A(z),$$

where A(z) is analytic at z = σ. Thus:

$$[z^n]\,H(z) \sim \frac{\sigma^{-n}}{n}.$$

(The error term implicit in this estimate is exponentially small.)

The BGF $H(z,u) = \log(1-uG(z))^{-1}$ has the variable u marking the number of components in H-objects. In particular, the mean number of components in a random H-object of size n is ∼ λn, where λ = 1/(σG′(σ)), and the distribution is concentrated around its mean. Similarly, the mean number of components with size k in a random $\mathcal{H}_n$ object is found to be asymptotic to $\lambda g_k \sigma^k$, where $g_k = [z^k]G(z)$.
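A small numerical experiment makes the estimates of the supercritical cycle schema concrete. The sketch below is an illustration under an assumed component class, G(z) = z + z² (chosen here only because σ is then the inverse golden ratio); the function name `cycle_schema` is hypothetical. It computes [z^n]H(z) exactly from H′ = G′/(1 − G) and the mean number of components from [z^n] G/(1 − G).

```python
from fractions import Fraction
from math import sqrt


def cycle_schema(g, n_max):
    """For G(z) with coefficient list g (g[0] == 0), return (h, mean) where
    h[n] = [z^n] log(1/(1 - G(z))) and mean[n] is the mean number of components
    in a random cycle structure of size n."""
    top = len(g) - 1
    inv = [1] + [0] * n_max                      # coefficients of 1/(1 - G)
    for n in range(1, n_max + 1):
        inv[n] = sum(g[k] * inv[n - k] for k in range(1, min(n, top) + 1))
    h = [Fraction(0)] * (n_max + 1)
    mean = [Fraction(0)] * (n_max + 1)
    for n in range(1, n_max + 1):
        # n * h_n = [z^(n-1)] G'(z) / (1 - G(z))
        h[n] = Fraction(sum(k * g[k] * inv[n - k]
                            for k in range(1, min(n, top) + 1)), n)
        # [z^n] G/(1 - G) = [z^n] 1/(1 - G) for n >= 1
        mean[n] = Fraction(inv[n]) / h[n]
    return h, mean


if __name__ == "__main__":
    h, mean = cycle_schema([0, 1, 1], 30)        # assumed example: G(z) = z + z^2
    sigma = (sqrt(5) - 1) / 2                    # root of G(sigma) = 1
    lam = 1 / (sigma * (1 + 2 * sigma))          # 1 / (sigma * G'(sigma))
    n = 30
    print(n * sigma ** n * float(h[n]))          # close to 1
    print(float(mean[n]) / n, lam)               # both close to 0.7236
```

For this choice of G the quantities n·h_n are the Lucas numbers, so the exponentially small error term of the estimate is directly visible.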
[Table: Equation (49), weights $f_k$ and their generating functions f(z); Equation (50), triangular arrays $g_k^{(n)}$ and their generating functions g(z).]

Figure VI.13. Typical weights (top) and triangular arrays (bottom) illustrating the discussion of combinatorial sums $S_n = \sum_{k=0}^{n} f_k\, g_k^{(n)}$.


Combinatorial sums. Singularity analysis permits us to discuss the asymptotic behaviour of entire classes of combinatorial sums at a fair level of generality, with asymptotic estimates coming out rather automatically. We examine here combinatorial sums of the form

$$S_n = \sum_{k=0}^{n} f_k\, g_k^{(n)},$$

where $f_k$ is a sequence of numbers, usually of a simple form and called the weights, while the $g_k^{(n)}$ are a triangular array of numbers, for instance Pascal’s triangle.

As weights $f_k$ we shall consider sequences such that f(z) is Δ-analytic with a singular expansion involving functions of the standard scale of Theorems VI.1, VI.2, VI.3. Typical examples⁹ for f(z) and $(f_k)$ are displayed in Figure VI.13, Equation (49). The triangular arrays discussed here are taken to be coefficients of the powers of some fixed function, namely,

$$g_k^{(n)} = [z^n]\,(g(z))^k \qquad\text{where}\qquad g(z) = \sum_{n=1}^{\infty} g_n z^n,$$

with g(z) an analytic function at the origin having non-negative coefficients and satisfying g(0) = 0. Examples are given in Figure VI.13, Equation (50). An interesting class of such arrays arises from the Lagrange Inversion Theorem (p. 732). Indeed, if g(z) is implicitly defined by g(z) = zG(g(z)), one has $g_k^{(n)} = \frac{k}{n}[w^{n-k}]\,G(w)^n$; the last three cases of (50) are obtained in this way (by taking G(w) as $1/(1-w)$, $(1+w)^2$, $e^w$).

By design, the generating function of the $S_n$ is simply

$$S(z) = \sum_{n=0}^{\infty} S_n z^n = f(g(z)) \qquad\text{with}\qquad f(z) = \sum_{k=0}^{\infty} f_k z^k.$$

Consequently, the asymptotic analysis of $S_n$ results by inspection from the way singularities of f(z) and g(z) get transformed by composition.

⁹ Weights such as log k and √k also satisfy these conditions, as seen in Section VI.8.
Example VI.12. Bernoulli sums. Let φ be a function from $\mathbb{Z}_{\ge 0}$ to $\mathbb{R}$ and write $f_k := \phi(k)$. Consider the sums

$$S_n := \sum_{k=0}^{n} \phi(k)\,\frac{1}{2^n}\binom{n}{k}.$$

If $X_n$ is a binomial random variable¹⁰, $X_n \in \operatorname{Bin}(n; \frac12)$, then $S_n = E(\phi(X_n))$ is exactly the expectation of $\phi(X_n)$. Then, by the binomial theorem, the OGF of the sequence $(S_n)$ is:

$$S(z) = \frac{2}{2-z}\, f\!\left(\frac{z}{2-z}\right).$$

Considering weights whose generating function, as in (49), has radius of convergence 1, what we have is a variant of the composition schema, with an additional prefactor. The composition scheme is of the supercritical type since the function g(z) = z/(2 − z), which has radius of convergence equal to 2, satisfies τ_g = ∞. The singularities of S(z) are then of the same type as those of the weight generating function f(z) and one verifies, in all cases of (49), that, to first asymptotic order, $S_n \sim \phi(n/2)$: this is in agreement with the fact that the binomial distribution is concentrated near its mean n/2. Singularity analysis furthermore provides complete asymptotic expansions; for instance,

$$E\!\left(\frac{1}{X_n}\,\Big|\, X_n > 0\right) = \frac{2}{n} + \frac{2}{n^2} + \frac{6}{n^3} + O(n^{-4}),$$
$$E\!\left(H_{X_n}\right) = \log\frac{n}{2} + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + O(n^{-4}).$$

See [208, 223] for more along these lines.
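Both expansions can be compared with exact values of the Bernoulli sums, computed with rational arithmetic. The following sketch uses hypothetical helper names; the numerical value of Euler's constant γ is hard-coded as an assumption of the example.

```python
from fractions import Fraction
from math import comb, log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for the comparison


def bernoulli_sum(n, phi):
    """Exact value of S_n = sum_k phi(k) * binom(n, k) / 2^n."""
    return sum(Fraction(comb(n, k), 2 ** n) * phi(k) for k in range(n + 1))


def inverse_weight(k):
    """phi(k) = 1/k for k > 0 and 0 for k = 0."""
    return Fraction(1, k) if k > 0 else Fraction(0)


def harmonic(k):
    """phi(k) = H_k, the k-th harmonic number."""
    return sum((Fraction(1, j) for j in range(1, k + 1)), Fraction(0))


def mean_inverse_conditioned(n):
    """E(1/X_n | X_n > 0) for X_n in Bin(n; 1/2)."""
    return bernoulli_sum(n, inverse_weight) / (1 - Fraction(1, 2 ** n))


if __name__ == "__main__":
    for n in (50, 100):
        exact = float(mean_inverse_conditioned(n))
        print(n, exact, 2 / n + 2 / n ** 2 + 6 / n ** 3)
    n = 50
    print(float(bernoulli_sum(n, harmonic)),
          log(n / 2) + EULER_GAMMA + 1 / (2 * n) - 1 / (12 * n ** 2))
```

Conditioning on $X_n > 0$ changes the unconditioned sum only by the factor $1/(1-2^{-n})$, which is exponentially close to 1.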


Example VI.13. Generalized Knuth-Ramanujan Q-functions. For reasons motivated by analysis of algorithms, Knuth has encountered repeatedly sums of the form

$$Q_n(\langle f_k\rangle) = f_0 + f_1\,\frac{n}{n} + f_2\,\frac{n(n-1)}{n^2} + f_3\,\frac{n(n-1)(n-2)}{n^3} + \cdots.$$

(See, e.g., [384, pp. 305-307].) There $(f_k)$ is a sequence of coefficients (usually of at most polynomial growth). For instance, the case $f_k \equiv 1$ yields the expected time until the first collision in the birthday paradox problem (Section II.3, p. 114).

A closer examination shows that the analysis of such $Q_n$ is reducible to singularity analysis. Writing

$$Q_n(\langle f_k\rangle) = f_0 + \frac{n!}{n^{n-1}} \sum_{k\ge 1} f_k\,\frac{n^{n-k-1}}{(n-k)!}$$

reveals the closeness with the last column of (50). Indeed, setting

$$F(z) = \sum_{k\ge 1} \frac{f_k}{k}\, z^k,$$

one has (n ≥ 1)

$$Q_n = f_0 + \frac{n!}{n^{n-1}}\,[z^n]\,S(z) \qquad\text{where}\qquad S(z) = F(T(z)),$$

and T(z) is the Cayley tree function ($T = ze^T$).

For weights $f_k = \phi(k)$ of polynomial growth, the schema is critical. Then, the singular expansion of S is obtained by composing the singular expansion of F with the expansion of T, namely, $T(z) \sim 1 - \sqrt{2}\sqrt{1-ez}$ as $z \to e^{-1}$. For instance, if $\phi(k) = k^r$ for some integer r ≥ 1 then F(z) has an rth order pole at z = 1. Then, the singularity type of F(T(z)) is $Z^{-r/2}$, where Z = (1 − ez), which is reflected by $S_n \asymp e^n n^{r/2-1}$ (we use ‘≍’ to represent order-of-growth information, disregarding multiplicative constants). After the final normalization, we see that $Q_n \asymp n^{(r+1)/2}$. Globally, for many weights of the form $f_k = \phi(k)$, we expect $Q_n$ to be of the form $\sqrt{n}\,\phi(\sqrt{n})$, in accordance with the fact that the expectation of the first collision in the birthday problem is on average near $\sqrt{\pi n/2}$.

¹⁰ A binomial random variable (p. 775) is a sum of Bernoulli variables: $X_n = \sum_{j=1}^{n} Y_j$, where the $Y_j$ are independent and distributed as a Bernoulli variable Y, with P(Y = 1) = p, P(Y = 0) = q = 1 − p.
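The reduction of Q-functions to coefficients of F(T(z)) can be checked directly, using the Lagrange form $[z^n]T(z)^k = k\,n^{n-k-1}/(n-k)!$. The sketch below is illustrative only: the function names are hypothetical, and the constant 2/3 in the comparison for $f_k \equiv 1$ comes from Knuth's classical expansion of his Q-function, which is an outside fact rather than something stated in this example.

```python
from fractions import Fraction
from math import factorial, pi, sqrt


def q_exact(n, f):
    """Q_n(<f_k>) = f_0 + f_1 n/n + f_2 n(n-1)/n^2 + ..., computed exactly."""
    total = Fraction(f(0))
    weight = Fraction(1)
    for k in range(1, n + 1):
        weight *= Fraction(n - k + 1, n)
        total += f(k) * weight
    return total


def q_via_tree_function(n, f):
    """f_0 + n!/n^(n-1) * [z^n] F(T(z)), with F(z) = sum_k (f_k / k) z^k."""
    coeff = Fraction(0)
    for k in range(1, n + 1):
        tree_power = Fraction(k * n ** (n - k), n * factorial(n - k))  # [z^n] T(z)^k
        coeff += Fraction(f(k), k) * tree_power
    return f(0) + Fraction(factorial(n), n ** (n - 1)) * coeff


def q_float_all_ones(n):
    """Floating-point value of Q_n(<1, 1, 1, ...>)."""
    total, weight = 1.0, 1.0
    for k in range(1, n + 1):
        weight *= (n - k + 1) / n
        total += weight
    return total


if __name__ == "__main__":
    print(q_exact(10, lambda k: k * k) == q_via_tree_function(10, lambda k: k * k))
    n = 2000
    print(q_float_all_ones(n), sqrt(pi * n / 2) + 2 / 3)
```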


▷ VI.23. General Bernoulli sums. Let $X_n \in \operatorname{Bin}(n; p)$ be a binomial random variable with general parameters p, q:

$$P(X_n = k) = \binom{n}{k} p^k q^{n-k}, \qquad q = 1-p.$$

Then with $f_k = \phi(k)$, one has

$$E(\phi(X_n)) = [z^n]\,\frac{1}{1-qz}\, f\!\left(\frac{pz}{1-qz}\right),$$

so that the analysis develops as in the case $\operatorname{Bin}(n; \frac12)$. ◁


▷ VI.24. Higher moments of the birthday problem. Take the model where there are n days in the year and let B be the random variable representing the first birthday collision. Then

$$P_n(B > k) = k!\,n^{-k}\binom{n}{k}, \qquad\text{and}\qquad E_n(\Phi(B)) = \Phi(1) + Q_n(\langle \Delta\Phi(k)\rangle), \quad\text{where}\quad \Delta\Phi(k) := \Phi(k+1) - \Phi(k).$$

For instance $E_n(B) = 1 + Q_n(\langle 1, 1, \ldots\rangle)$. We thus get moments of various functionals (here stated to two asymptotic terms)


D(x) x x2 4x x3 + x? x4 4x3 


BODY), | Se Ono: BE an Bn ee 


via singularity analysis. ◁


▷ VI.25. How to weigh an urn? The “shake-and-paint” algorithm. You are given an urn containing an unknown number N of identical looking balls. How to estimate this number in much fewer than O(N) operations? A probabilistic solution due to Brassard and Bratley [92] uses a brush and some paint. Shake the urn, pull out a ball, then mark it with paint and replace it into the urn. Repeat until you find an already painted ball. Let X be the number of operations. One has $E(X) \sim \sqrt{\pi N/2}$. Furthermore the quantity $Y := X^2/2$ constitutes, by the previous note, an asymptotically unbiased estimator of N, in the sense that E(Y) ∼ N. In other words, count the time till an already painted ball is first found, and return half of the square of this time. One also has $\sqrt{V(Y)} \sim N$. By performing the experiment m times (using m different colours of paint) and by taking the arithmetic average of the m estimates, one obtains an unbiased estimator whose typical relative accuracy is $\sqrt{1/m}$. For instance, m = 16 gives an accuracy of 25%. (Similar principles are used in the design of data mining algorithms.) ◁
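The shake-and-paint estimator is easy to experiment with. The sketch below is a hypothetical rendering of the procedure described in the note: `shake_and_paint` simulates one experiment, `estimate_urn_size` averages m of them, and `exact_moments` evaluates E(X) and E(X²/2) from the tail probabilities P(X > k) = N(N−1)···(N−k+1)/N^k. The seed and the parameter values are arbitrary choices.

```python
import random
from math import pi, sqrt


def shake_and_paint(n_balls, rng):
    """Draw with replacement until an already painted ball shows up;
    return the number of draws X."""
    painted = set()
    draws = 0
    while True:
        draws += 1
        ball = rng.randrange(n_balls)
        if ball in painted:
            return draws
        painted.add(ball)


def estimate_urn_size(n_balls, m, rng):
    """Average of m independent estimates Y = X^2 / 2."""
    return sum(shake_and_paint(n_balls, rng) ** 2 / 2 for _ in range(m)) / m


def exact_moments(n_balls):
    """(E(X), E(X^2 / 2)) via E(g(X)) = g(0) + sum_k (g(k+1) - g(k)) P(X > k)."""
    mean, mean_square = 0.0, 0.0
    tail = 1.0                       # P(X > 0)
    for k in range(n_balls + 1):
        mean += tail
        mean_square += (2 * k + 1) * tail
        tail *= (n_balls - k) / n_balls  # P(X > k+1) from P(X > k)
    return mean, mean_square / 2


if __name__ == "__main__":
    n_balls = 10_000
    print(exact_moments(n_balls), sqrt(pi * n_balls / 2))
    print(estimate_urn_size(n_balls, 400, random.Random(12345)))
```

With m = 400 repetitions the typical relative error predicted by the note is about 5%, while each repetition costs only about $\sqrt{N}$ draws.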


▷ VI.26. Catalan sums. These are defined by

$$S_n := \sum_{k\ge 0} f_k \binom{2n}{n-k}, \qquad S(z) = \frac{1}{\sqrt{1-4z}}\, f\!\left(\frac{1-2z-\sqrt{1-4z}}{2z}\right).$$

The case when ρ_f = 1 corresponds to a critical composition, which can be discussed much in the same way as Ramanujan sums. ◁
VI.10. Closure properties

At this stage¹¹, we have available composition rules for singular expansions under operations such as ±, ×, ÷: these are induced by corresponding rules for extended formal power series, where generalized exponents and logarithmic factors are allowed. Also, from Section VI.7, inversion of analytic functions normally gives rise to square-root singularities, and, from Section VI.9, functions amenable to singularity analysis are essentially closed under composition.

In this section we show that functions amenable to singularity analysis (SA functions) satisfy explicit closure properties under differentiation, integration, and Hadamard product. (The contents are liberally borrowed from an article of Fill, Flajolet, and Kapur [208], to which we refer for details.) In order to keep the developments simple, we shall mostly restrict attention to functions that are Δ-analytic and admit a simple singular expansion of the form

$$f(z) = \sum_{j=0}^{J} c_j\,(1-z)^{\alpha_j} + O\!\left((1-z)^{A}\right), \tag{51}$$

or a simple singular expansion with logarithmic terms

$$f(z) = \sum_{j=0}^{J} c_j(L(z))\,(1-z)^{\alpha_j} + O\!\left((1-z)^{A}\right), \qquad L(z) := \log\frac{1}{1-z}, \tag{52}$$

where each $c_j$ is a polynomial. These are the cases most frequently occurring in applications (the proof techniques are easily extended to more general situations).

Subsection VI.10.1 treats differentiation and integration; Subsection VI.10.2 presents the closure of functions that admit simple expansions under Hadamard product. Finally, Subsection VI.10.3 concludes with an examination of several interesting classes of tree recurrences, where all the closure properties previously established are put to use in order to quantify precisely the asymptotic behaviour of recurrences that are attached to tree models.

VI.10.1. Differentiation and integration. Functions that are SA happen to be closed under differentiation; this is in sharp contrast with real analysis. In the simple cases¹² of (51) and (52), closure under integration is also granted. The general principle (Theorems VI.8 and VI.9 below) is the following: Derivatives and primitives of functions that are amenable to singularity analysis admit singular expansions obtained term by term, via formal differentiation and integration.

The following statement is a version, tuned to our needs, of well-known differentiability properties of complex asymptotic expansions (see, e.g., Olver’s book [465, p. 9]).

¹¹ This section represents supplementary material not needed elsewhere in the book, so that it may be omitted on first reading.

¹² It is possible but unwieldy to treat a larger class, which then needs to include arbitrarily nested logarithms, since, for instance, ∫ dx/x = log x, ∫ dx/(x log x) = log log x, and so on.
Figure VI.14. The geometry of the contour γ(z) used in the proof of the differentiation theorem: a circle centred at z, of radius κ|1 − z|.


Theorem VI.8 (Singular differentiation). Let f(z) be Δ-analytic with a singular expansion near its singularity of the simple form

$$f(z) = \sum_{j=0}^{J} c_j\,(1-z)^{\alpha_j} + O\!\left((1-z)^{A}\right).$$

Then, for each integer r > 0, the derivative $\frac{d^r}{dz^r} f(z)$ is Δ-analytic. The expansion of the derivative at the singularity is obtained through term-by-term differentiation:

$$\frac{d^r}{dz^r} f(z) = (-1)^r \sum_{j=0}^{J} c_j\,\frac{\Gamma(\alpha_j+1)}{\Gamma(\alpha_j+1-r)}\,(1-z)^{\alpha_j-r} + O\!\left((1-z)^{A-r}\right).$$

Proof. All that is required is to establish the effect of differentiation on error terms, which is expressed symbolically as

$$\frac{d}{dz}\, O\!\left((1-z)^{A}\right) = O\!\left((1-z)^{A-1}\right).$$

By bootstrapping, only the case of a single differentiation (r = 1) needs to be considered.

Let g(z) be a function that is regular in a domain Δ(φ, η) where it is assumed to satisfy $g(z) = O((1-z)^A)$ for z ∈ Δ. Choose a subdomain Δ′ := Δ(φ′, η′), where φ < φ′ < π/2 and 0 < η′ < η. By elementary geometry, for a sufficiently small κ > 0, the disc of radius κ|z − 1| centred at a value z ∈ Δ′ lies entirely in Δ; see Figure VI.14. We fix such a small value κ and let γ(z) represent the boundary of that disc oriented positively.

The starting point is Cauchy’s integral formula

$$g'(z) = \frac{1}{2i\pi} \int_{C} g(w)\,\frac{dw}{(w-z)^2}, \tag{53}$$

a direct consequence of the residue theorem. Here C should encircle z while lying inside the domain of regularity of g, and we opt for the choice C ≡ γ(z). Then trivial bounds applied to (53) give

$$|g'(z)| = O\!\left(\|\gamma(z)\| \cdot |1-z|^{A}\,|1-z|^{-2}\right) = O\!\left(|1-z|^{A-1}\right).$$

The estimate involves the length of the contour, ‖γ(z)‖, which is O(1 − z) by construction, as well as the bound on g itself, which is $O((1-z)^A)$ since all points of the contour are themselves at a distance exactly of the order of |1 − z| from 1. ∎


▷ VI.27. Differentiation and logarithms. Let g(z) satisfy

$$g(z) = O\!\left((1-z)^{A}\, L(z)^{k}\right), \qquad L(z) = \log\frac{1}{1-z},$$

for $k \in \mathbb{Z}_{\ge 0}$. Then, one has

$$\frac{d^r}{dz^r}\, g(z) = O\!\left((1-z)^{A-r}\, L(z)^{k}\right).$$

(The proof is similar to that of Theorem VI.8.) ◁
It is well known that integration of asymptotic expansions is usually easier than 
differentiation. Here is a statement custom-tailored to our needs. 
Theorem VI.9 (Singular integration). Let f(z) be Δ-analytic and admit an expansion near its singularity of the form

$$f(z) = \sum_{j=0}^{J} c_j\,(1-z)^{\alpha_j} + O\!\left((1-z)^{A}\right).$$

Then $\int_0^z f(t)\,dt$ is Δ-analytic. Assume further that none of the quantities $\alpha_j$ and A equals −1.

(i) If A < −1, then the singular expansion of ∫f is

$$\int_0^z f(t)\,dt = -\sum_{j=0}^{J} \frac{c_j}{\alpha_j+1}\,(1-z)^{\alpha_j+1} + O\!\left((1-z)^{A+1}\right). \tag{54}$$

(ii) If A > −1, then the singular expansion of ∫f is

$$\int_0^z f(t)\,dt = -\sum_{j=0}^{J} \frac{c_j}{\alpha_j+1}\,(1-z)^{\alpha_j+1} + L_0 + O\!\left((1-z)^{A+1}\right), \tag{55}$$

where the “integration constant” $L_0$ has the value

$$L_0 := \sum_{\alpha_j<-1} \frac{c_j}{\alpha_j+1} + \int_0^1 \Big[ f(t) - \sum_{\alpha_j<-1} c_j\,(1-t)^{\alpha_j} \Big]\,dt.$$

Proof. The basic technique consists in integrating term by term the singular expansion of f. We let r(z) be the remainder term in the expansion of f, that is,

$$r(z) := f(z) - \sum_{j=0}^{J} c_j\,(1-z)^{\alpha_j}.$$

Figure VI.15. The contour used in the proof of the integration theorem.

By assumption, throughout the Δ-domain one has, for some positive constant K,

$$|r(z)| \le K\,|1-z|^{A}.$$

(i) Case A < −1. Straight-line integration between 0 and z provides (54), as soon as it has been established that

$$\int_0^z r(t)\,dt = O\!\left(|1-z|^{A+1}\right).$$

By Cauchy’s integral formula, we can choose any path of integration that stays within the region of analyticity of r. We choose the contour γ := γ₁ ∪ γ₂, shown in Figure VI.15. Then, one has

$$\left|\int_\gamma r(t)\,dt\right| \le \left|\int_{\gamma_1} r(t)\,dt\right| + \left|\int_{\gamma_2} r(t)\,dt\right| \le K\int_{\gamma_1} |1-t|^{A}\,|dt| + K\int_{\gamma_2} |1-t|^{A}\,|dt| = O\!\left(|1-z|^{A+1}\right),$$

where the symbol |dt| designates the differential line-length element in the corresponding curvilinear integral. Both integrals are $O(|1-z|^{A+1})$: for the integral along γ₁, this results from explicitly carrying out the integration; for the integral along γ₂, this results from the trivial bound $O(\|\gamma_2\|\,(1-z)^A)$.

(ii) Case A > −1. We let $f_-(z)$ represent the “divergence part” of f that gives rise to non-integrability:

$$f_-(z) := \sum_{\alpha_j<-1} c_j\,(1-z)^{\alpha_j}.$$

Then with the decomposition $f = [f - f_-] + f_-$, integrations can be performed separately. First, one finds

$$\int_0^z f_-(t)\,dt = -\sum_{\alpha_j<-1} \frac{c_j}{\alpha_j+1}\,(1-z)^{\alpha_j+1} + \sum_{\alpha_j<-1} \frac{c_j}{\alpha_j+1}.$$

Next, observe that the asymptotic condition guarantees the existence of $\int_0^1$ applied to $[f - f_-]$, so that

$$\int_0^z [f(t) - f_-(t)]\,dt = \int_0^1 [f(t) - f_-(t)]\,dt + \int_1^z [f(t) - f_-(t)]\,dt.$$

The first of these two integrals is a constant that contributes to $L_0$. As to the second integral, term-by-term integration yields

$$\int_1^z [f(t) - f_-(t)]\,dt = -\sum_{\alpha_j>-1} \frac{c_j}{\alpha_j+1}\,(1-z)^{\alpha_j+1} + \int_1^z r(t)\,dt.$$

The remainder integral is finite, given the growth condition on the remainder term, and, upon carrying out the integration along the rectilinear segment joining 1 to z, trivial bounds show that it is indeed $O(|1-z|^{A+1})$. ∎


▷ VI.28. Logarithmic cases. The case in which either some $\alpha_j$ or A is −1 is easily treated by the additional rules

$$\int_0^z (1-t)^{-1}\,dt = L(z), \qquad \int_0^z O\!\left(|1-t|^{-1}\right)dt = O(L(z)),$$

that are consistent with elementary integration, and similar rules are easily derived for powers of logarithms. Furthermore, the corresponding O-transfers hold true. (The proofs are simple modifications of the one given above for the basic case.) ◁


VI.10.2. Hadamard Products. The Hadamard product of two functions f(z) and g(z) analytic at the origin is defined as their term-by-term product,

$$f(z) \odot g(z) = \sum_{n\ge 0} f_n g_n z^n, \qquad\text{where}\quad f(z) = \sum_{n\ge 0} f_n z^n, \quad g(z) = \sum_{n\ge 0} g_n z^n. \tag{56}$$

As we are going to see, following Fill, Flajolet, and Kapur [208], functions amenable to singularity analysis are closed under Hadamard product. Establishing such a closure property requires methods for composing functions from the basic scale, namely $(1-z)^a$, as well as error terms of the form $O((1-z)^A)$. We address each problem in turn.

Theorem VI.10 (Hadamard Composition). When none of a, b, a + b is an integer, the Hadamard product $(1-z)^a \odot (1-z)^b$ has an infinite expansion, valid in a Δ-domain, with exponent scale {0, 1, 2, ...} ∪ {a+b+1, a+b+2, ...}; namely,

$$(1-z)^a \odot (1-z)^b \sim \sum_{k\ge 0} \lambda_k^{(a,b)}\,\frac{(1-z)^k}{k!} + \sum_{k\ge 0} \mu_k^{(a,b)}\,\frac{(1-z)^{a+b+1+k}}{k!},$$

where the coefficients λ and μ are given by

$$\lambda_k^{(a,b)} = \frac{\Gamma(1+a+b)}{\Gamma(1+a)\Gamma(1+b)}\,\frac{(-a)^{\overline{k}}\,(-b)^{\overline{k}}}{(-a-b)^{\overline{k}}}, \qquad \mu_k^{(a,b)} = \frac{\Gamma(-a-b-1)}{\Gamma(-a)\Gamma(-b)}\,\frac{(1+a)^{\overline{k}}\,(1+b)^{\overline{k}}}{(2+a+b)^{\overline{k}}}.$$

Here $x^{\overline{k}}$ is defined for $k \in \mathbb{Z}_{\ge 0}$ by $x^{\overline{k}} := x(x+1)\cdots(x+k-1)$.

Proof. The expansion around the origin,

$$(1-z)^a = 1 + \frac{(-a)}{1}\,z + \frac{(-a)(-a+1)}{2!}\,z^2 + \cdots, \tag{57}$$

gives through term-by-term multiplication

$$(1-z)^a \odot (1-z)^b = {}_2F_1[-a, -b; 1; z]. \tag{58}$$

Here ${}_2F_1$ represents the classical hypergeometric function of Gauss (p. 751) defined by

$${}_2F_1[\alpha, \beta; \gamma; z] = 1 + \frac{\alpha\beta}{\gamma}\,\frac{z}{1!} + \frac{\alpha(\alpha+1)\,\beta(\beta+1)}{\gamma(\gamma+1)}\,\frac{z^2}{2!} + \cdots. \tag{59}$$

From their transformation theory (see for instance [604, Ch XIV] and Appendix B.4: Holonomic functions, p. 748, for proof techniques), hypergeometric functions can generally be expanded in the vicinity of z = 1 by means of the z ↦ 1 − z transformation. Instantiation of this transformation with γ = 1 yields

$${}_2F_1[\alpha, \beta; 1; z] = \frac{\Gamma(1-\alpha-\beta)}{\Gamma(1-\alpha)\Gamma(1-\beta)}\;{}_2F_1[\alpha, \beta; \alpha+\beta; 1-z] + \frac{\Gamma(\alpha+\beta-1)}{\Gamma(\alpha)\Gamma(\beta)}\,(1-z)^{-\alpha-\beta+1}\;{}_2F_1[1-\alpha, 1-\beta; 2-\alpha-\beta; 1-z]. \tag{60}$$

The statement follows, upon appealing to the definition (59) of hypergeometric functions. ∎
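The leading singular coefficient $\mu_0^{(a,b)}$ of Theorem VI.10 can be sanity-checked at the level of coefficients, since regular terms of the expansion leave no trace asymptotically. The sketch below uses hypothetical function names and arbitrarily chosen exponents a = −1/2, b = −3/4 (so that a + b + 1 = −1/4 < 0), and compares the coefficients of the Hadamard product with those of $\mu_0\,(1-z)^{a+b+1}$.

```python
from math import gamma


def power_coeffs(a, n_max):
    """[z^n] (1 - z)^a for n = 0..n_max, via c_n = c_(n-1) * (n - 1 - a) / n."""
    coeffs = [1.0]
    for n in range(1, n_max + 1):
        coeffs.append(coeffs[-1] * (n - 1 - a) / n)
    return coeffs


def hadamard(f, g):
    """Term-by-term product of two coefficient sequences."""
    return [x * y for x, y in zip(f, g)]


def mu0(a, b):
    """Leading coefficient of the singular part (1 - z)^(a+b+1) in Theorem VI.10."""
    return gamma(-a - b - 1) / (gamma(-a) * gamma(-b))


if __name__ == "__main__":
    a, b, n_max = -0.5, -0.75, 2000
    product = hadamard(power_coeffs(a, n_max), power_coeffs(b, n_max))
    singular = power_coeffs(a + b + 1, n_max)
    for n in (10, 100, 1000, 2000):
        print(n, product[n] / (mu0(a, b) * singular[n]))
```

The ratios tend to 1 with a relative error of order 1/n, as expected from the next terms of both expansions.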


▷ VI.29. Special cases. The case where either a or b is an integer poses no difficulty, since, for $m \in \mathbb{Z}_{\ge 0}$, the function $(1-z)^m \odot g(z)$ is a polynomial, while $(1-z)^{-m} \odot g(z)$ is reducible to a derivative of g, to which the Singular Differentiation Theorem (p. 419) can be applied. The case $a + b \in \mathbb{Z}$ needs transformation formulae that extend (60): the principles (based on a Lindelöf integral representation, p. 237, and developed by Barnes) are described in [604, §14.53], and the formulae appear explicitly in [3, pp. 559-560]. ◁

▷ VI.30. Simple expansions with logarithmic terms. The technique of differentiation with respect to a parameter,

$$\left[(1-z)^a L(z)\right] \odot (1-z)^b = -\frac{\partial}{\partial a}\left[(1-z)^a \odot (1-z)^b\right],$$

makes it possible to derive explicit composition rules for expansions involving logarithmic terms. ◁

The way Hadamard products preserve Δ-analyticity and compose error terms in singular expansions is summarized by the next statement.

Theorem VI.11 (Hadamard closure). (i) Assume that f(z) and g(z) are analytic in a Δ-domain, Δ(ψ₀, η). Then, the Hadamard product (f ⊙ g)(z) is analytic in a (possibly smaller) Δ-domain, Δ′.

(ii) Assume further that

$$f(z) = O\!\left((1-z)^a\right) \quad\text{and}\quad g(z) = O\!\left((1-z)^b\right), \qquad z \in \Delta(\psi_0, \eta).$$

Then the Hadamard product (f ⊙ g)(z) admits in Δ′ an expansion given by the following rules:

- If a + b + 1 < 0, then
$$(f \odot g)(z) = O\!\left((1-z)^{a+b+1}\right).$$

- If k < a + b + 1 < k + 1, for some integer $k \in \mathbb{Z}_{\ge -1}$, then
$$(f \odot g)(z) = \sum_{j=0}^{k} \frac{(-1)^j}{j!}\,(f \odot g)^{(j)}(1)\,(1-z)^j + O\!\left((1-z)^{a+b+1}\right).$$

- If a + b + 1 is a non-negative integer, then (with $L(z) = \log(1-z)^{-1}$)
$$(f \odot g)(z) = \sum_{j=0}^{a+b} \frac{(-1)^j}{j!}\,(f \odot g)^{(j)}(1)\,(1-z)^j + O\!\left((1-z)^{a+b+1} L(z)\right).$$

Proof. (Sketch) The starting point is an important formula due to Hadamard that expresses Hadamard products as a contour integral:

$$f(z) \odot g(z) = \frac{1}{2i\pi} \int_\gamma f(w)\, g\!\left(\frac{z}{w}\right) \frac{dw}{w}. \tag{61}$$

The contour γ in the w-plane should be chosen such that both factors, f(w) and g(z/w), are analytic. In other words, given the domain Δ in which both f and g are analytic, one should have $\gamma \subset \Delta \cap (z\Delta^{-1})$.

In the first case (a + b + 1 < 0), the precise geometry of a feasible contour γ is described in [208], the principles being similar to those employed in the construction of Hankel contours elsewhere in this chapter. The integral giving the value of the Hadamard product is finally estimated trivially, based on the order of growth assumptions on f and g, as z → 1. This approach extends to the case a + b + 1 = 0, where a logarithmic factor comes in.

For the remaining cases, the easy identity

$$\vartheta^{s+t}(f \odot g) = (\vartheta^{s} f) \odot (\vartheta^{t} g), \qquad\text{where}\quad \vartheta \equiv z\frac{d}{dz},$$

reduces the analysis to the situation where a + b + 1 < 0. It suffices to differentiate sufficiently many times and finally integrate back, as permitted by the Singular Integration Theorem (p. 420). ∎


Globally, Theorems VI.10 and VI.11 establish the closure under Hadamard prod- 
ucts of functions amenable to singularity analysis, which satisfy an expansion (51). In 
practice, in order to derive the singular expansion of a function at a singularity, one 
may conveniently appeal to the Zigzag Algorithm described in Figure VI.16, whose 
validity is ensured by the a priori knowledge of the existence of an expansion guaran- 
teed by Theorems VI.10 and VI.11. (The “zigzag” qualifier reflects the fact that the 
algorithm proceeds back and forth, by making a repeated use of the correspondences 
between coefficient asymptotics and singularity asymptotics.) A typical application 
of this algorithm appears in (64) and (65) below, in the context of Pólya’s drunkard problem.
Let f(z) and g(z) be Δ-analytic and admit simple singular expansions of the form (51) or (52). What is sought is the singular expansion of

$$h(z) := f(z) \odot g(z).$$

Step 1. Determine the asymptotic expansions $f_n = [z^n]f(z)$ and $g_n = [z^n]g(z)$ induced by the singular expansions of f and g in accordance with the singularity analysis process. Given finite singular expansions of f and g, the order C of the error in the expansion of h is known a priori by Theorem VI.11.

Step 2. Deduce from Step 1 an asymptotic expansion of $h_n = [z^n]h(z)$ by usual multiplication from the expansions of $f_n$ and $g_n$.

Step 3. Reconstruct by singularity analysis a function H(z) that is singular at 1 and is such that

$$[z^n]\,H(z) \sim [z^n]\,h(z).$$

This can be done by using the expansions of basic functions, as provided by Theorems VI.1 and VI.2 in the reverse direction. By construction, H(z) is a sum of functions of the form $(1-z)^{\alpha} L(z)^{k}$, which are all singular at 1.

Step 4. Output the singular expansion of f ⊙ g as

$$h(z) = H(z) + P(z) + O\!\left((1-z)^{C}\right),$$

where P is a polynomial of degree δ, which is the largest integer < C. The polynomial P(z) is needed, since polynomials (and more generally functions analytic at 1) do not leave a trace in asymptotic expansions of coefficients. Since h(z) − H(z) is δ times differentiable at 1, one must take

$$P(z) = \sum_{j=0}^{\delta} \frac{(-1)^j}{j!}\,\partial_z^{\,j}\big(h(z) - H(z)\big)_{z=1}\,(1-z)^j.$$

Figure VI.16. The Zigzag Algorithm for computing singular expansions of Hadamard products.


Example VI.14. Pólya’s drunkard problem. (This example is taken from Fill et al. [208].) In the d-dimensional lattice $\mathbb{Z}^d$ of points with integer coordinates, the drunkard performs a random walk starting from the origin with steps in $\{-1,+1\}^d$, each taken with equal likelihood. The probability that the drunkard is back at the origin after 2n steps is

$$q_n^{(d)} = \left(\frac{1}{2^{2n}}\binom{2n}{n}\right)^{d}, \tag{62}$$

since the walk is a product of d independent one-dimensional walks. The probability that 2n is the epoch of the first return to the origin is the quantity $p_n^{(d)}$, which is determined implicitly by

$$\left(1 - \sum_{n=1}^{\infty} p_n^{(d)} z^n\right)^{-1} = \sum_{n=0}^{\infty} q_n^{(d)} z^n, \tag{63}$$

as results from the decomposition of loops into primitive loops (see also Note I.65, p. 90). In terms of the associated ordinary generating functions P and Q, this relation reads as $(1-P(z))^{-1} = Q(z)$, implying P(z) = 1 − 1/Q(z).

The asymptotic analysis of the $q_n$ is straightforward; that of the $p_n$ is more involved and is of interest in connection with recurrence and transience of the random walk; see, e.g., [170, 403]. The Hadamard closure theorem provides a direct tool to solve this problem. Define

$$\lambda(z) := \sum_{n\ge 0} \frac{1}{2^{2n}}\binom{2n}{n} z^n \equiv \frac{1}{\sqrt{1-z}}.$$

Then, Equations (62) and (63) entail

$$P(z) = 1 - \frac{1}{\lambda(z)^{\odot d}}, \qquad\text{where}\quad \lambda(z)^{\odot d} := \lambda(z) \odot \cdots \odot \lambda(z) \quad (d \text{ times}).$$

The singularities of P(z) are found as follows.


Case d = 1: No Hadamard product is involved and

$$P(z) = 1 - \sqrt{1-z}, \qquad\text{implying}\qquad p_n^{(1)} = \frac{1}{n\,2^{2n-1}}\binom{2n-2}{n-1} \sim \frac{1}{2\sqrt{\pi n^3}}.$$

(This agrees with the classical combinatorial solution expressed in terms of Catalan numbers.)
Case d = 2: By the Hadamard closure theorem, the function $Q(z) = \lambda(z) \odot \lambda(z)$, with $\lambda(z) = (1-z)^{-1/2}$, admits a priori a singular expansion at z = 1 that is composed solely of elements of the form $(1-z)^{\alpha}$ possibly multiplied by integral powers of the logarithmic function L(z) = log(1/(1 − z)). From a computational standpoint (cf the Zigzag Algorithm), it is then best to start from the coefficients themselves,

$$q_n^{(2)} \sim \left(\frac{1}{\sqrt{\pi n}} - \frac{1}{8\sqrt{\pi n^3}} + \cdots\right)^{2} \sim \frac{1}{\pi}\left(\frac{1}{n} - \frac{1}{4n^2} + \cdots\right), \tag{64}$$

and reconstruct the only singular expansion that is compatible, namely

$$Q(z) = \frac{1}{\pi} L(z) + K + O\!\left((1-z)^{1-\epsilon}\right), \tag{65}$$

where ε > 0 is an arbitrarily small constant and K is fully determined as the limit as z → 1 of $Q(z) - \pi^{-1}L(z)$. Then it can be seen that the function P is Δ-continuable. (Proof: Otherwise, there would be complex poles arising from zeros of the function Q on the unit disc, and this would entail in $p_n^{(2)}$ the presence of terms oscillating around 0, a fact that contradicts the necessary positivity of probabilities.) The singular expansion of P(z) at z = 1 results immediately from that of Q(z):

$$P(z) \sim 1 - \frac{\pi}{L(z)} + \frac{\pi^2 K}{L(z)^2} + \cdots,$$

so that, by Theorems VI.2 and VI.3, one has

$$p_n^{(2)} = \frac{\pi}{n\log^2 n} - 2\pi\,\frac{\gamma + \pi K}{n\log^3 n} + O\!\left(\frac{1}{n\log^4 n}\right),$$
$$K = 1 + \sum_{n=1}^{\infty}\left(16^{-n}\binom{2n}{n}^{2} - \frac{1}{\pi n}\right) \doteq 0.8825424006106063735858257.$$

(See the study by Louchard et al. [422, Sec. 4] for somewhat similar calculations.)
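The constant K of the two-dimensional case lends itself to a direct numerical check. The sketch below (hypothetical function name) sums the series defining K, using the recurrence $r_n = r_{n-1}(2n-1)/(2n)$ for $r_n = 4^{-n}\binom{2n}{n}$, and adds a first-order tail correction derived from (64), where the summand behaves like $-1/(4\pi n^2)$; the truncation point is an arbitrary choice.

```python
from math import pi


def polya_constant(n_terms=20_000):
    """Approximate K = 1 + sum_{n>=1} (16^-n binom(2n, n)^2 - 1/(pi n))."""
    total = 1.0
    r = 1.0  # r_n = binom(2n, n) / 4^n, starting from r_0 = 1
    for n in range(1, n_terms + 1):
        r *= (2 * n - 1) / (2 * n)
        total += r * r - 1 / (pi * n)
    # Tail correction: the summand is ~ -1/(4 pi n^2), whose sum over n > N is ~ -1/(4 pi N).
    return total - 1 / (4 * pi * n_terms)


if __name__ == "__main__":
    print(polya_constant())  # to be compared with 0.8825424006...
```

Without the tail correction the partial sums converge only at the rate 1/N, which is the coefficient-side counterpart of the error term in (65).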


Case d = 3: This case is easy since Q(z) remains finite at its singularity z = 1 where it admits an expansion in powers of $(1-z)^{1/2}$, with the consequence that

$$q_n^{(3)} \sim \left(\frac{1}{\sqrt{\pi n}} - \frac{1}{8\sqrt{\pi n^3}} + \cdots\right)^{3} \sim \frac{1}{\pi^{3/2}}\left(\frac{1}{n^{3/2}} - \frac{3}{8 n^{5/2}} + \cdots\right).$$

The function Q(z) is a priori Δ-continuable and its singular expansion can be reconstructed from the form of coefficients:

$$Q(z) \mathop{\sim}_{z\to 1} Q(1) - \frac{2}{\pi}\sqrt{1-z} + O(|1-z|),$$

leading to

$$P(z) = \left(1 - \frac{1}{Q(1)}\right) - \frac{2}{\pi Q(1)^2}\sqrt{1-z} + O(|1-z|).$$

By singularity analysis, the last expansion gives

$$p_n^{(3)} = \frac{1}{\pi^{3/2} Q(1)^2}\,\frac{1}{n^{3/2}} + O\!\left(\frac{1}{n^2}\right),$$
$$Q(1) = \frac{\pi}{\Gamma\!\left(\frac34\right)^{4}} \doteq 1.3932039296856768591842463.$$

A complete asymptotic expansion in powers $n^{-3/2}, n^{-5/2}, \ldots$ can be obtained by the same devices. In particular this improves the error term above to $O(n^{-5/2})$. The explicit form of Q(1) results from its expression as the generalized hypergeometric ${}_3F_2[\frac12, \frac12, \frac12; 1, 1; 1]$, which evaluates by Clausen’s theorem and Kummer’s identity to the square of a complete elliptic integral. (See the papers by Larry Glasser for context, for instance [293]; nowadays, several computer algebra systems even provide this value automatically.)

Higher dimensions are treated similarly, with logarithmic terms surfacing in asymptotic expansions for all even dimensions.


VI.10.3. Applications to tree recurrences. To conclude with singularity analysis theory, we present the general framework of tree recurrences, also known as probabilistic divide-and-conquer recurrences, which are of the general form

$$f_n = t_n + \sum_{k} p_{n,k}\,(f_k + f_{n-a-k}), \qquad (n \ge n_0). \tag{66}$$

There, $(f_n)$ is the sequence implicitly determined by the recurrence, assuming known initial conditions $f_0, \ldots, f_{n_0-1}$; the sequence $(t_n)$ is known as the sequence of tolls; the array $(p_{n,k})$ is a triangular array of numbers that are probabilities in the sense that, for each fixed n ≥ 0, one has $\sum_k p_{n,k} = 1$; the number a is a small fixed integer (usually 0 or 1).

The interpretation of the recurrence is in the form of a splitting process: a collection of n elements is given; a number a of these is put aside and what remains is partitioned into two subgroups, a “left” subgroup of cardinality $K_n$ and a “right” subgroup of cardinality $n - a - K_n$. The quantity $K_n$ is a random variable with probability distribution

$$P(K_n = k) = p_{n,k}.$$

The splitting is repeated (recursively) till only groups of size less than the threshold $n_0$ are obtained. Assuming stochastic independence of all the random variables K involved, it is seen that $f_n$ represents the expectation of the (total) cost $C_n$ of a random (recursive) splitting, when a single stage involving n elements incurs a toll equal to $t_n$. In symbols:

$$f_n = E(C_n), \qquad C_n = t_n + C_{K_n} + C_{n-a-K_n}.$$
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Clearly, a particular realization of the splitting process can be represented by a 
binary tree. With a suitable choice of probabilities, such processes can be used to anal- 
yse cost functional of increasing binary trees, and binary Catalan trees, for instance. 
A prime motivation is the analysis of divide-and-conquer algorithms in computer sci- 
ence, like quicksort, mergesort, union-find algorithms, and so on [132, 383, 384, 537, 
538, 598]. Our treatment once more follows the article [208]. 
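To make the recurrence (66) concrete, here is a minimal Python sketch that solves it exactly with rational arithmetic. It assumes $a \ge 1$, so that the right-hand side only involves earlier values of the sequence; the names `solve_tree_recurrence`, `bst_prob` and `catalan_prob` are hypothetical helpers introduced for illustration. The two splitting laws shown are the uniform one and the Catalan one, both with $a = 1$.

```python
from fractions import Fraction
from math import comb

def solve_tree_recurrence(toll, prob, a, n_max, initial):
    """Solve f_n = t_n + sum_k p_{n,k} (f_k + f_{n-a-k}) for n0 <= n <= n_max.

    `initial` lists f_0, ..., f_{n0-1}; `prob(n, k)` returns p_{n,k} for
    0 <= k <= n - a. Assumes a >= 1, so only earlier values are needed.
    """
    f = [Fraction(x) for x in initial]
    for n in range(len(f), n_max + 1):
        total = Fraction(toll(n))
        for k in range(0, n - a + 1):
            total += prob(n, k) * (f[k] + f[n - a - k])
        f.append(total)
    return f

def bst_prob(n, k):
    """Uniform splitting: p_{n,k} = 1/n for 0 <= k <= n-1."""
    return Fraction(1, n)

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def catalan_prob(n, k):
    """Catalan splitting: p_{n,k} = C_k C_{n-1-k} / C_n."""
    return Fraction(catalan(k) * catalan(n - 1 - k), catalan(n))
```

With the unit toll $t_n = 1$ for $n \ge 1$ and $f_0 = 0$, every stage counts one node, so both splitting laws should return $f_n = n$; this gives a quick consistency check of the implementation.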

A general approach to the asymptotic solution of a tree recurrence goes as follows. First,
introduce generating functions,

$$f(z) = \sum_{n \ge 0} f_n\, \omega_n z^n, \qquad t(z) = \sum_{n \ge 0} t_n\, \omega'_n z^n,$$

for some normalization sequences $(\omega_n)$ and $(\omega'_n)$ that are problem-specific. (So,
$\omega_n = 1$ gives rise to an OGF, $\omega_n = 1/n!$ to an EGF, with other normalizations
being also useful.) Then, by linearity of the original recurrence, there exists a linear
operator $\mathcal{L}$ on series (and functions), such that

$$f(z) = \mathcal{L}[t(z)].$$

Provided the splitting probabilities $p_{n,k}$ have expressions of a tractable form, it is
reasonable to attempt expressing $\mathcal{L}$ in terms of the usual operations of analysis.
One may then investigate the way $\mathcal{L}$ affects singularities and deduce the asymptotic
form of the cost sequence $(f_n)$ from the singularities of its generating function, $f(z)$.
An interesting feature of this approach is to allow for a powerful discussion of the
relationship between tolls and induced costs, in a way that parallels composition of
singularities in Section VI.9. Closure properties discussed earlier in this section are a
crucial ingredient in the intervening singularity analysis process.

The three examples that we present combine closure properties with the singu- 
larity analysis of polylogarithms of Section VI.8. Example VI.15 is relative to in- 
creasing binary trees (defined in Example II.17, p. 143), which model binary search 
trees of computer science. Example VI.16 discusses additive costs of random binary 
Catalan trees in the perspective of tree recurrences. Finally, Example VI.17 shows the 
applicability of singularity analysis to a basic coalescence–fragmentation process.


Example VI.15. The binary search tree recurrence. One of the simplest random tree models is
defined as follows: a random binary tree of size $n \ge 1$ is obtained by taking a root and
appending to it a left subtree of size $K_n$ and a right subtree of size $n - 1 - K_n$, where
$K_n$ is uniformly distributed over the set of permissible values $\{0, 1, \ldots, n-1\}$.
(Trees under this model are equivalent to increasing binary trees encountered in
Example II.17, p. 143, and to binary search trees of Note III.33, p. 203.) In the notations
of (66), this process corresponds to

$$p_{n,k} = P(K_n = k) = \frac{1}{n}, \qquad 0 \le k \le n-1.$$

The associated tree recurrence is then

$$f_n = t_n + \frac{2}{n}\sum_{k=0}^{n-1} f_k, \qquad f_0 = t_0,$$

which translates for OGFs,
$$f(z) := \sum_{n \ge 0} f_n z^n, \qquad t(z) := \sum_{n \ge 0} t_n z^n,$$
into a linear integral equation:

$$f(z) = t(z) + 2\int_0^z f(w)\,\frac{dw}{1-w}. \tag{67}$$

Differentiation yields the ordinary differential equation

$$f'(z) = t'(z) + \frac{2}{1-z}\, f(z), \qquad f(0) = t_0,$$

which is then solved by the variation-of-constants method. In this way, it is found that an
integral transform expresses the relation between the GF of tolls and the GF of total costs.
Assuming without loss of generality $t_0 = 0$, we have (with $\partial_w \equiv \frac{d}{dw}$)

$$f(z) = \mathcal{L}[t(z)], \quad\text{where}\quad
\mathcal{L}[t(z)] = \frac{1}{(1-z)^2}\int_0^z \bigl(\partial_w t(w)\bigr)(1-w)^2\, dw. \tag{68}$$


First, simple toll sequences that admit generating functions of a simple form can be
employed to build a repertoire³ that already provides useful indications on the relations
between the orders of growth of $(t_n)$ and $(f_n)$. For instance, we find, for the
rising-factorial tolls

$$t_n^{\alpha} = \binom{n+\alpha}{\alpha}, \qquad t^{\alpha}(z) = \frac{1}{(1-z)^{\alpha+1}} - 1,$$
$$f^{\alpha}(z) = \frac{\alpha+1}{\alpha-1}\left[\frac{1}{(1-z)^{\alpha+1}} - \frac{1}{(1-z)^{2}}\right], \qquad
f_n^{\alpha} = \frac{\alpha+1}{\alpha-1}\left[\binom{n+\alpha}{\alpha} - (n+1)\right],$$

for $\alpha \ne 1$, while $\alpha = 1$ corresponding to $t_n^{1} = n+1$ leads to

$$f^{1}(z) = \frac{2}{(1-z)^2}\log\frac{1}{1-z}, \qquad
f_n^{1} = 2(n+1)(H_{n+1} - 1) = 2n\log n + O(n),$$

with $H_n$ a harmonic number. The emergence of an extra logarithmic factor for $\alpha = 1$ is
to be noted: it corresponds to the fact that path length in an increasing binary tree of
size $n$ is $\sim 2n\log n$. Such elementary techniques provide the top two entries of
Figure VI.17.
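The repertoire entries for rising-factorial tolls can be verified exactly against the recurrence $f_n = t_n + \frac{2}{n}\sum_{k<n} f_k$ with $f_0 = t_0 = 0$. The sketch below is a hedged illustration: it restricts to integer $\alpha$ (so that binomial coefficients are plain integers), and the function names are assumptions made here.

```python
from fractions import Fraction
from math import comb

def bst_costs(toll, n_max):
    """Exact f_n for f_n = t_n + (2/n) * sum_{k<n} f_k, with f_0 = t_0 = 0."""
    f = [Fraction(0)]
    running_sum = Fraction(0)            # f_0 + ... + f_{n-1}
    for n in range(1, n_max + 1):
        running_sum += f[n - 1]
        f.append(Fraction(toll(n)) + Fraction(2, n) * running_sum)
    return f

def closed_form(alpha, n):
    """(alpha+1)/(alpha-1) * [binom(n+alpha, alpha) - (n+1)], integer alpha >= 2."""
    return Fraction(alpha + 1, alpha - 1) * (comb(n + alpha, alpha) - (n + 1))

def closed_form_alpha_one(n):
    """2 (n+1) (H_{n+1} - 1), the exceptional case alpha = 1."""
    harmonic = sum(Fraction(1, j) for j in range(1, n + 2))
    return 2 * (n + 1) * (harmonic - 1)
```

For instance, `bst_costs(lambda n: n + 1, 30)` should agree term by term with `closed_form_alpha_one`, which exhibits the $2n\log n$ growth through the harmonic number.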

Singularity analysis furthermore permits us to develop a complete asymptotic expansion for
tolls of the form $\sqrt{n}$, $\log n$, and many others. Consider for instance the toll
$t_n^{\alpha} = n^{\alpha}$, for which the generating function $t(z)$ is recognized to be a
polylogarithm. From Theorem VI.7 (p. 408), the function $t(z)$ admits a singular expansion
in terms of elements of the form $(1-z)^{\beta}$, with the main term corresponding to
$\beta = -\alpha - 1$ when $\alpha > -1$. The $\mathcal{L}$ transformation of (68) reads as a
succession of operations, “differentiate, multiply by $(1-z)^2$, integrate, multiply by
$(1-z)^{-2}$”, which are covered by Theorems VI.8 and VI.9. Consequently, the chain on any
particular element starts as

$$c\,(1-z)^{\beta} \;\overset{\partial}{\longmapsto}\; -c\beta\,(1-z)^{\beta-1}
\;\overset{\times(1-z)^2}{\longmapsto}\; -c\beta\,(1-z)^{\beta+1}.$$

At this stage, integration intervenes: according to Theorem VI.9, assuming $\beta \ne -2$ and
ignoring integration constants, we find

$$-c\beta\,(1-z)^{\beta+1} \;\overset{\int}{\longmapsto}\; c\,\frac{\beta}{\beta+2}\,(1-z)^{\beta+2}
\;\overset{\times(1-z)^{-2}}{\longmapsto}\; c\,\frac{\beta}{\beta+2}\,(1-z)^{\beta}.$$

³ The repertoire approach is developed in an attractive manner by Greene and Knuth in [310].
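Tolls ($t_n$)                                         Costs ($f_n$)
$t_n = \binom{n+\alpha}{\alpha}$  $(\alpha > 1)$      $f_n = \frac{\alpha+1}{\alpha-1}\left[\binom{n+\alpha}{\alpha} - (n+1)\right] \sim \frac{\alpha+1}{\alpha-1}\,\frac{n^{\alpha}}{\Gamma(\alpha+1)}$
$t_n = \binom{n+\alpha}{\alpha}$  $(0 < \alpha < 1)$  $f_n = \frac{1+\alpha}{1-\alpha}\left[(n+1) - \binom{n+\alpha}{\alpha}\right] \sim \frac{1+\alpha}{1-\alpha}\, n$
$t_n = n^{\alpha}$  $(2 < \alpha)$                    $f_n = \frac{\alpha+1}{\alpha-1}\, n^{\alpha} + O(n^{\alpha-1})$
$t_n = n^{\alpha}$  $(1 < \alpha < 2)$                $f_n = \frac{\alpha+1}{\alpha-1}\, n^{\alpha} + O(n)$
$t_n = n^{\alpha}$  $(0 < \alpha < 1)$                $f_n = K_{\alpha}\, n + O(n^{\alpha})$
$t_n = \log n$                                        $f_n = K'_0\, n - \log n + O(1)$

Figure VI.17. Tolls and costs for the binary search tree recurrence, with $t_0 = 0$.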


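Thus, the singular element $(1-z)^{\beta}$ corresponds to a contribution
$$\frac{\beta}{\beta+2}\,\frac{n^{-\beta-1}}{\Gamma(-\beta)},$$
which is of order $O(n^{-\beta-1})$. This chain of operations suffices to determine the
leading order of $f_n$ when $t_n = n^{\alpha}$ and $\alpha > 1$.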
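As an illustration of this chain, take the toll $t_n = n^3$: the main singular element of $t(z)$ has exponent $\beta = -4$, and the factor $\beta/(\beta+2)$ then predicts $f_n \sim 2n^3$. The following Python sketch, with hypothetical function names and floating-point arithmetic, compares this prediction with the values produced by the recurrence $f_n = t_n + \frac{2}{n}\sum_{k<n} f_k$.

```python
def bst_costs_float(toll, n_max):
    """Floating-point values of f_n for the binary search tree recurrence, f_0 = 0."""
    f = [0.0]
    running_sum = 0.0                  # f_0 + ... + f_{n-1}
    for n in range(1, n_max + 1):
        running_sum += f[-1]
        f.append(toll(n) + 2.0 * running_sum / n)
    return f

def leading_constant(alpha):
    """Factor beta / (beta + 2) evaluated at beta = -alpha - 1."""
    beta = -alpha - 1
    return beta / (beta + 2)
```

At $n = 2000$ the ratio $f_n / n^3$ is already within one percent of `leading_constant(3)`, the discrepancy being of relative order $1/n$.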

The derivation above is representative of the main lines of the analysis, but it has left
aside the determination of integration constants, which play a dominant rôle when
$t_n = n^{\alpha}$ and $\alpha < 1$ (because a term of the form $K/(1-z)^2$ then dominates in
$f(z)$). Introduce, in accordance with the statement of the Singular Integration Theorem
(Theorem VI.9, p. 420) the quantity

$$K[t] := \int_0^1 \left[\, t'(w)(1-w)^2 - \bigl(t'(w)(1-w)^2\bigr)_{-} \right] dw,$$

where $f_{-}$ represents the sum of singular terms of exponent $< -1$ in the singular
expansion of $f(z)$. Then, for $t_n = n^{\alpha}$ with $0 < \alpha < 1$, taking into account
the integration constant (which gets multiplied by $(1-z)^{-2}$, given the shape of
$\mathcal{L}$), we find:

$$f_n \sim K_{\alpha}\, n, \qquad K_{\alpha} = K[\mathrm{Li}_{-\alpha}] = 2\sum_{n=1}^{\infty} \frac{n^{\alpha}}{(n+1)(n+2)}.$$

Similarly, the toll $t_n = \log n$ gives rise to

$$f_n \sim K'_0\, n, \qquad K'_0 = 2\sum_{n=1}^{\infty} \frac{\log n}{(n+1)(n+2)} = 1.2035649167.$$

This last estimate quantifies the entropy of the distribution of binary search trees, which
is studied by Fill in [207], and discussed in the reference book by Cover and Thomas on
information theory [134, p. 74–76].
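The constant $K'_0$ is defined by a slowly convergent series, whose tail after $N$ terms is of order $(\log N)/N$. A minimal Python sketch of its evaluation follows; the function name, the truncation point, and the two-term integral approximation of the tail are assumptions of this illustration rather than anything prescribed by the text.

```python
import math

def k0_prime(n_terms=200000):
    """Approximate 2 * sum_{n>=1} log(n) / ((n+1)(n+2))."""
    total = 0.0
    for n in range(2, n_terms + 1):      # the n = 1 term vanishes since log 1 = 0
        total += 2.0 * math.log(n) / ((n + 1) * (n + 2))
    # Tail estimate: integrate 2 log(x) (1/x^2 - 3/x^3) from m to infinity,
    # using 1/((x+1)(x+2)) = 1/x^2 - 3/x^3 + O(1/x^4) and m = n_terms + 1/2.
    m = n_terms + 0.5
    log_m = math.log(m)
    tail = 2.0 * (log_m + 1.0) / m - (3.0 * log_m + 1.5) / m ** 2
    return total + tail
```

Dividing the result by $\log 2$ converts the constant to bits per node, which is the form in which entropy rates are usually quoted.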


Tolls ($t_n$)                               Costs ($f_n$)
$n^{\alpha}$  $(\frac32 < \alpha)$          $\frac{\Gamma(\alpha - \frac12)}{\Gamma(\alpha)}\, n^{\alpha+1/2} + O(n^{\alpha-1/2})$
$n^{3/2}$                                   $\frac{2}{\sqrt{\pi}}\, n^{2} + O(n\log n)$
$n^{\alpha}$  $(\frac12 < \alpha < \frac32)$  $\frac{\Gamma(\alpha - \frac12)}{\Gamma(\alpha)}\, n^{\alpha+1/2} + O(n)$
$n^{1/2}$                                   $\frac{1}{\sqrt{\pi}}\, n\log n + O(n)$
$n^{\alpha}$  $(0 < \alpha < \frac12)$      $K_{\alpha}\, n + O(1)$
$\log n$                                    $K'_0\, n + O(\sqrt{n})$

Figure VI.18. Tolls and costs for the binary tree recurrence.


Example VI.16. The binary tree recurrence. Consider a procedure that, given a (pruned)
binary tree, performs certain calculations (without affecting the tree itself) at a cost of
$t_n$ for size $n$, then recursively calls itself on the left and right subtrees. If the
binary tree to which the procedure is applied is drawn uniformly among all binary trees of
size $n$, the expectation of the total cost of the procedure satisfies the recurrence


$$f_n = t_n + \sum_{k=0}^{n-1} \frac{C_k C_{n-1-k}}{C_n}\,(f_k + f_{n-1-k})
\quad\text{with}\quad C_n = \frac{1}{n+1}\binom{2n}{n}. \tag{69}$$
Indeed, the quantity
$$p_{n,k} = \frac{C_k C_{n-1-k}}{C_n}$$
represents the probability that a random tree of size $n$ has a left subtree of size $k$ and
a right subtree of size $n - 1 - k$. It is then natural to introduce the generating functions

$$t(z) = \sum_{n \ge 0} t_n C_n z^n, \qquad f(z) = \sum_{n \ge 0} f_n C_n z^n,$$

and the recurrence (69) translates into a linear equation:

$$f(z) = t(z) + 2zC(z) f(z),$$


with $C(z)$ the OGF of Catalan numbers. Now, given a toll sequence $(t_n)$ with ordinary
generating function
$$\tau(z) := \sum_{n \ge 0} t_n z^n,$$
the function $t(z)$ is a Hadamard product: $t(z) = \tau(z) \odot C(z)$. Furthermore, $C(z)$ is
well known, so that the fundamental relation is

$$f(z) = \mathcal{L}[\tau(z)], \quad\text{where}\quad
\mathcal{L}[\tau(z)] = \frac{\tau(z) \odot C(z)}{\sqrt{1-4z}}, \qquad
C(z) = \frac{1 - \sqrt{1-4z}}{2z}. \tag{70}$$

This transform relates the ordinary generating function of tolls to the normalized generating
function of the total costs via a Hadamard product.
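Since $1/\sqrt{1-4z} = \sum_m \binom{2m}{m} z^m$, the relation (70) amounts to the coefficient identity $f_n C_n = \sum_{k=0}^{n} t_k C_k \binom{2(n-k)}{n-k}$. The Python sketch below checks this identity against the recurrence (69) in exact arithmetic; the function names and the sample toll are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def catalan(n):
    return comb(2 * n, n) // (n + 1)

def costs_by_recurrence(toll, n_max):
    """f_n from recurrence (69), with f_0 = t_0."""
    f = [Fraction(toll(0))]
    for n in range(1, n_max + 1):
        total = Fraction(toll(n))
        for k in range(n):
            p = Fraction(catalan(k) * catalan(n - 1 - k), catalan(n))
            total += p * (f[k] + f[n - 1 - k])
        f.append(total)
    return f

def costs_by_transform(toll, n_max):
    """f_n read off from (70): f_n C_n = [z^n] (tau Hadamard C)(z) / sqrt(1 - 4z)."""
    out = []
    for n in range(n_max + 1):
        coeff = sum(toll(k) * catalan(k) * comb(2 * (n - k), n - k)
                    for k in range(n + 1))
        out.append(Fraction(coeff, catalan(n)))
    return out
```

Both functions should return the same list for any toll; the transform version avoids the quadratic recursion over previously computed costs and makes the Hadamard-product structure explicit.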
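Tolls ($t_n$)                               Costs ($f_n$)
$n^{\alpha}$  $(\frac32 < \alpha)$          $\frac{\Gamma(\alpha - \frac12)}{\sqrt{2}\,\Gamma(\alpha)}\, n^{\alpha+1/2} + O(n^{\alpha-1/2})$
$n^{3/2}$                                   $\sqrt{\frac{2}{\pi}}\, n^{2} + O(n\log n)$
$n^{\alpha}$  $(\frac12 < \alpha < \frac32)$  $\frac{\Gamma(\alpha - \frac12)}{\sqrt{2}\,\Gamma(\alpha)}\, n^{\alpha+1/2} + O(n)$
$n^{1/2}$                                   $\frac{1}{\sqrt{2\pi}}\, n\log n + O(n)$
$n^{\alpha}$  $(0 < \alpha < \frac12)$      $K_{\alpha}\, n + O(1)$
$\log n$                                    $K'_0\, n + O(\sqrt{n})$

Figure VI.19. Tolls and costs for the Cayley tree recurrence.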


The calculation for simple tolls like $n^r$ with $r \in \mathbb{Z}_{\ge 0}$ can be carried out
elementarily. For the tolls $t_n^{\alpha} = n^{\alpha}$ what is required is the singular
expansion of

$$\tau(z) \odot C\!\left(\frac{z}{4}\right) = \mathrm{Li}_{-\alpha}(z) \odot C\!\left(\frac{z}{4}\right)
= \sum_{n \ge 1} \frac{n^{\alpha}}{n+1}\binom{2n}{n}\left(\frac{z}{4}\right)^{n}.$$

This is precisely covered by Theorems VI.7 (p. 408), VI.10 (p. 422), and VI.11 (p. 423). The
results of Figure VI.18 follow, after routine calculations.


Example VI.17. The Cayley tree recurrence. Consider $n$ vertices labelled $1, \ldots, n$.
There are $(n-1)!\, n^{n-2}$ sequences of edges,

$$\langle u_1, v_1\rangle,\ \langle u_2, v_2\rangle,\ \ldots,\ \langle u_{n-1}, v_{n-1}\rangle,$$

that give rise to a tree over $\{1, \ldots, n\}$: there are $n^{n-2}$ unrooted trees of size
$n$, and the $n-1$ edges of each can be listed in $(n-1)!$ orders. At each stage $k$, the
edges numbered 1 to $k$ determine a forest. Each addition of an edge connects two trees
[that then become rooted] and reduces the number of trees in the forest by 1, so that the
forest evolves from the totally disconnected graph (at time 0) to an unrooted tree (at time
$n-1$). If we consider each of the sequences to be equally likely, the probability that
$u_{n-1}$ and $v_{n-1}$ belong to components of size $k$ and $(n-k)$ is

$$\frac{1}{2(n-1)}\binom{n}{k}\frac{k^{k-1}(n-k)^{n-k-1}}{n^{n-2}}.$$

(The reason is that there are $k^{k-1}$ rooted trees of size $k$; the last added edge has
$n-1$ possibilities and 2 possible orientations.)
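That these quantities form a probability distribution over $k = 1, \ldots, n-1$ is an instance of Abel's binomial identity, $\sum_{0<k<n}\binom{n}{k}k^{k-1}(n-k)^{n-k-1} = 2(n-1)n^{n-2}$. A short Python sketch checks it exactly for small $n$; the name `merge_prob` is a hypothetical helper introduced here.

```python
from fractions import Fraction
from math import comb

def merge_prob(n, k):
    """Probability that the last edge joins components of sizes k and n - k."""
    numerator = comb(n, k) * k ** (k - 1) * (n - k) ** (n - k - 1)
    denominator = 2 * (n - 1) * n ** (n - 2)
    return Fraction(numerator, denominator)

def total_probability(n):
    """Sum of merge_prob(n, k) over 0 < k < n; expected to equal 1."""
    return sum(merge_prob(n, k) for k in range(1, n))
```

For $n = 4$, for example, the three probabilities are $36/96$, $24/96$ and $36/96$.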

Assume that the aggregation of two trees into a tree of size equal to $\ell$ incurs a toll
of $t_{\ell}$. The total cost of the aggregation process for a final tree of size $n$
satisfies the recurrence

$$f_n = t_n + \sum_{0 < k < n} p_{n,k}\,(f_k + f_{n-k}), \qquad
p_{n,k} = \frac{1}{2(n-1)}\binom{n}{k}\frac{k^{k-1}(n-k)^{n-k-1}}{n^{n-2}}. \tag{71}$$

The recurrence (71) has been studied in detail by Knuth and Pittel [383], building upon an
earlier analysis of Knuth and Schönhage [384]. A prime motivation of the cited works is the
emergence of this recurrence in the study of algorithms that dynamically manage equivalence
relations (the so-called union-find algorithm [384]).
Given the sequence of tolls $(t_n)$, we introduce the generating function

$$\tau(z) := \sum_{n \ge 1} t_n z^n,$$

and let $T$ be the Cayley tree function ($T = ze^{T}$). For total costs, the generating
function adopted is
$$f(z) = \sum_{n \ge 1} f_n\, n^{n-1}\frac{z^n}{n!}.$$
The basic recurrence (71) can then be rephrased as a linear ordinary differential equation,
which is solved by the variation-of-constants method. This gives rise to an integral
transform involving a Hadamard product, namely,

$$f(z) = \mathcal{L}[\tau(z)], \quad\text{with}\quad
\mathcal{L}[\tau](z) = \frac{1}{2}\,\frac{T(z)}{1 - T(z)}\int_0^z \partial_w\bigl(\tau(w) \odot T(w)^2\bigr)\,\frac{dw}{T(w)}. \tag{72}$$

Though the expression of the transform looks formidable at first sight, it is really nothing
but a short sequence of basic operations, “Hadamard product, multiplication, differentiation,
division, integration, multiplication”, each of which has a quantifiable effect on functions
of singularity analysis class. (The singularity structure of $T(z)$ is itself determined by
the Singular Inversion Theorem, Theorem VI.6, p. 404.)

The net result is that the effect of tolls of the form $n^{\alpha}$, $\log n$, and so on, can
be analysed: see Figure VI.19 for a listing of estimates. Details of the proof are left as an
exercise to our reader and are otherwise found in [208, §5.3]. The analogy of behaviour with
the Catalan tree recurrence stands out. This example is also of interest since it furnishes
an analytically tractable model of a coalescence-fragmentation process, which is of great
interest in several areas of science, for which we refer to Aldous’ survey [9].
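The transform (72) can be tested on truncated power series. The Python sketch below, a hedged illustration with hypothetical helper names, computes the costs $f_n$ from the recurrence (71) and compares $f_n\, n^{n-1}/n!$ with the coefficients produced by the sequence of operations in (72). It assumes the convention $f_1 = 0$ (a single vertex requires no merge), which is the normalization consistent with the lower integration limit 0.

```python
from fractions import Fraction
from math import comb, factorial

def cayley_costs(toll, n_max):
    """f_n from recurrence (71), with f_1 = 0 (assumed convention)."""
    f = [Fraction(0)] * (n_max + 1)
    for n in range(2, n_max + 1):
        total = Fraction(toll(n))
        for k in range(1, n):
            p = Fraction(comb(n, k) * k ** (k - 1) * (n - k) ** (n - k - 1),
                         2 * (n - 1) * n ** (n - 2))
            total += p * (f[k] + f[n - k])
        f[n] = total
    return f

def mul(a, b):
    """Truncated product of two power series given as coefficient lists."""
    m = min(len(a), len(b))
    out = [Fraction(0)] * m
    for i in range(m):
        for j in range(m - i):
            out[i + j] += a[i] * b[j]
    return out

def divide(a, b):
    """Truncated quotient a / b of power series, assuming b[0] != 0."""
    q = []
    for n in range(min(len(a), len(b))):
        q.append((a[n] - sum(q[i] * b[n - i] for i in range(n))) / b[0])
    return q

def transform_72(toll, n_max):
    """Coefficients of (1/2) T/(1-T) * int_0^z d/dw(tau Hadamard T^2) dw / T(w)."""
    m = n_max + 3                       # a few extra terms absorb truncation losses
    T = [Fraction(0)] + [Fraction(n ** (n - 1), factorial(n)) for n in range(1, m)]
    T2 = mul(T, T)
    had = [Fraction(0), Fraction(0)] + [toll(n) * T2[n] for n in range(2, m)]
    d = [(n + 1) * had[n + 1] for n in range(m - 1)]     # derivative; d[0] == 0
    q = divide(d[1:], T[1:])            # d / T after cancelling a common factor w
    integral = [Fraction(0)] + [q[n] / (n + 1) for n in range(len(q))]
    one_minus_T = [Fraction(1)] + [-c for c in T[1:]]
    kernel = divide(T, one_minus_T)
    return [c / 2 for c in mul(kernel, integral)]
```

The coefficient of $z^n$ in the output of `transform_72` should equal `cayley_costs(toll, n_max)[n]` multiplied by $n^{n-1}/n!$, for $1 \le n \le$ `n_max`.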


VI.11. Tauberian theory and Darboux’s method 


There are several alternative approaches to the analysis of coefficients of func- 
tions that are of moderate growth. Naturally, all such methods must provide estimates 
compatible with singularity analysis theory (Theorems VI.1, VI.2, and VI.3). Each 
one requires some sort of “regularity condition” either on the part of the function or 
on the part of the coefficient sequence, the regularity condition of singularity analysis 
being in essence analytic continuation. 

The methods briefly surveyed here fall into three broad categories: (i) Elementary 
real analytic methods; (ii) Tauberian theorems; (iii) Darboux’s method. 

Elementary real analytic methods assume some a priori smoothness conditions on 
the coefficient sequence; they are included here for the sake of completeness, though 
properly speaking they do not belong to the galaxy of complex asymptotic methods. 
Their scope is mostly limited to the analysis of products while the other methods 
permit one to approach more general functional composition patterns. Tauberian the- 
orems belong to the category of advanced real analysis methods; they also need some 
a priori regularity on the coefficients, typically positivity or monotonicity. Darboux’s 
method requires some smoothness of the function on the closed unit disc, and, by its 
techniques and scope, it is the closest to singularity analysis. 
We content ourselves with a brief discussion of the main results. For more information, the
reader is referred to Odlyzko’s excellent survey [461].


Elementary real analytic methods. An asymptotic equivalent of the coefficients 
of a function can sometimes be worked out elementarily from simple properties of the 
component functions. The regularity conditions are a smooth asymptotic behaviour of 
the coefficients of one of the two factors in a product of generating functions. A prime 
source for these techniques is Bender’s survey [36]. 


Theorem VI.12 (Real analysis asymptotics). Let $a(z) = \sum a_n z^n$ and $b(z) = \sum b_n z^n$
be two power series with radii of convergence $\alpha > \beta \ge 0$, respectively. Assume
that $b(z)$ satisfies the ratio test,
$$\frac{b_{n-1}}{b_n} \to \beta \quad\text{as } n \to \infty.$$
Then the coefficients of the product $f(z) = a(z) \cdot b(z)$ satisfy, provided
$a(\beta) \ne 0$:
$$[z^n] f(z) \sim a(\beta)\, b_n \quad\text{as } n \to \infty.$$

Proof. (Sketch) The basis of the proof is the following chain:
$$\begin{aligned}
f_n &= a_0 b_n + a_1 b_{n-1} + a_2 b_{n-2} + \cdots + a_n b_0 \\
    &= b_n\left(a_0 + a_1\frac{b_{n-1}}{b_n} + a_2\frac{b_{n-2}}{b_n} + \cdots + a_n\frac{b_0}{b_n}\right) \\
    &= b_n\left(a_0 + a_1\left(\frac{b_{n-1}}{b_n}\right) + a_2\left(\frac{b_{n-2}}{b_{n-1}}\right)\left(\frac{b_{n-1}}{b_n}\right) + \cdots\right) \\
    &\sim b_n\left(a_0 + a_1\beta + a_2\beta^2 + \cdots\right).
\end{aligned}$$
There, only the last line requires a little elementary analysis that is left as an exercise
to the reader (see Pólya–Szegő [492], Problem 178, Part I, Volume I).

This theorem applies for instance to the EGF of 2-regular graphs:
$$f(z) = a(z) \cdot b(z) \quad\text{with}\quad a(z) = e^{-z/2 - z^2/4}, \qquad b(z) = \frac{1}{\sqrt{1-z}},$$
for which it gives $f_n \sim e^{-3/4}\binom{n - 1/2}{n} \sim \frac{e^{-3/4}}{\sqrt{\pi n}}$, in
accordance with Example VI.2 (p. 395). Clearly, a whole collection of lemmas can be stated in
the same vein. Singularity analysis usually provides more complete expansions, although
Theorem VI.12 does apply to a few situations not covered by it.
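The estimate for 2-regular graphs lends itself to a direct numerical check. The following Python sketch computes the coefficients of $e^{-z/2-z^2/4}/\sqrt{1-z}$ by convolution in floating point and compares them with $a(1)\,b_n \approx e^{-3/4}/\sqrt{\pi n}$. The function name and the use of the differential equation $a' = -\frac{1+z}{2}a$ to generate the coefficients of $a(z)$ are choices made for this illustration.

```python
import math

def two_regular_coefficients(n_max):
    """Coefficients f_n = [z^n] exp(-z/2 - z^2/4) / sqrt(1 - z), for n_max >= 1."""
    # a(z) = exp(-z/2 - z^2/4) satisfies a' = -(1 + z)/2 * a, hence
    # (n + 1) a_{n+1} = -(a_n + a_{n-1}) / 2.
    a = [1.0, -0.5]
    for n in range(1, n_max):
        a.append(-(a[n] + a[n - 1]) / (2.0 * (n + 1)))
    # b_n = [z^n] (1 - z)^(-1/2) = binom(2n, n) / 4^n.
    b = [1.0]
    for n in range(1, n_max + 1):
        b.append(b[-1] * (2 * n - 1) / (2 * n))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_max + 1)]
```

Multiplying the $n$th coefficient by $n!$ recovers the number of 2-regular graphs on $n$ labelled vertices for small $n$, which provides an independent check of the series computation.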


Tauberian theory. Tauberian methods apply to functions whose growth is only 
known along the positive real line. The regularity conditions are in the form of ad- 
ditional assumptions on the coefficients (positivity or monotonicity) known under the 
name of Tauberian “side conditions”. An insightful introduction to the subject may 
be found in Titchmarsh’s book [577], and a detailed exposition in Postnikov’s monograph [494]
and Korevaar’s compendium [389]. We cite the most famous of all Tauberian theorems due to
Hardy, Littlewood, and Karamata. For the purpose of this section, a function $\Lambda$ is said
to be slowly varying at infinity iff, for any $c > 0$, one has
$\Lambda(cx)/\Lambda(x) \to 1$ as $x \to +\infty$. (Examples of slowly varying functions are
provided by powers of logarithms or iterated logarithms.)

Theorem VI.13 (The HLK Tauberian theorem). Let $f(z)$ be a power series with radius of
convergence equal to 1, satisfying

$$f(z) \sim \frac{1}{(1-z)^{\alpha}}\,\Lambda\!\left(\frac{1}{1-z}\right) \qquad (z \to 1^{-}), \tag{73}$$

for some $\alpha \ge 0$ with $\Lambda$ a slowly varying function. Assume that the coefficients
$f_n = [z^n] f(z)$ are all non-negative (this is the “side condition”). Then

$$\sum_{k=0}^{n} f_k \sim \frac{n^{\alpha}}{\Gamma(\alpha+1)}\,\Lambda(n). \tag{74}$$

The conclusion (74) is consistent with the result given by singularity analysis: under the
conditions, and if in addition analytic continuation is assumed, then

$$f_n \sim \frac{n^{\alpha-1}}{\Gamma(\alpha)}\,\Lambda(n), \tag{75}$$

which by summation yields the estimate (74).

It must be noted that a Tauberian theorem requires very little on the part of the function.
However, it gives little, since it does not include error estimates. Also, the result it
provides is valid in the more restrictive sense of mean values, or Cesàro averages. (If
further regularity conditions on the $f_n$ are available, for instance monotonicity, then the
conclusion of (75) can then be deduced from (74) by purely elementary real analysis.) The
method applies only to functions that are large enough at their singularity (the assumption
$\alpha \ge 0$), and despite numerous efforts to improve the conclusions, it is the case that
Tauberian theorems do not have much to offer in terms of error estimates.

Appeal to a Tauberian theorem may be justified when a function has, apart from the positive
half line, a very irregular behaviour near its circle of convergence, for instance when each
point of the unit circle is a singularity. (The function is then said to admit the unit
circle as a natural boundary.) An interesting example of this situation is discussed by
Greene and Knuth [309] who consider the function

$$f(z) = \prod_{k=1}^{\infty}\left(1 + \frac{z^k}{k}\right), \tag{76}$$

which is the EGF of permutations having cycles all of different lengths. A little
computation shows that

$$\begin{aligned}
\log \prod_{k=1}^{\infty}\left(1 + \frac{z^k}{k}\right)
 &= \sum_{k=1}^{\infty}\frac{z^k}{k} - \frac{1}{2}\sum_{k=1}^{\infty}\frac{z^{2k}}{k^2} + \frac{1}{3}\sum_{k=1}^{\infty}\frac{z^{3k}}{k^3} - \cdots \\
 &= \log\frac{1}{1-z} - \gamma + o(1).
\end{aligned}$$

(Only the last line requires some care, see [309].) Thus, we have

$$f(z) \sim \frac{e^{-\gamma}}{1-z} \quad\Longrightarrow\quad f_0 + f_1 + \cdots + f_n \sim e^{-\gamma}\, n,$$

by virtue of Theorem VI.13. In fact, Greene and Knuth were able to supplement this argument
by a “bootstrapping” technique and show a stronger result, namely

$$f_n \to e^{-\gamma}.$$
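The convergence of $f_n$ to $e^{-\gamma} \approx 0.5615$ is easy to observe numerically. The sketch below expands the product (76) up to a chosen degree in floating point; the function name and the hard-coded value of Euler's constant are assumptions of this illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant, hard-coded for this sketch

def distinct_cycle_length_coefficients(n_max):
    """Coefficients of prod_{k>=1} (1 + z^k / k), truncated at degree n_max."""
    f = [0.0] * (n_max + 1)
    f[0] = 1.0
    for k in range(1, n_max + 1):
        # Descending order ensures each factor (1 + z^k/k) is applied exactly once.
        for n in range(n_max, k - 1, -1):
            f[n] += f[n - k] / k
    return f
```

Factors with $k$ larger than the truncation degree do not affect the retained coefficients, so the truncated product is exact up to rounding. Already at $n = 300$ the coefficient differs from $e^{-\gamma}$ by less than one percent.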

▷ VI.31. Fine asymptotics of the Greene–Knuth problem. With $f(z)$ as in (76), we have

$$[z^n] f(z) = e^{-\gamma} + \frac{e^{-\gamma}}{n} + \frac{e^{-\gamma}}{n^2}\bigl(-\log n - 1 - \gamma + \log 2\bigr)
 + \frac{1}{n^3}\Bigl[e^{-\gamma}\log^2 n + c_1\log n + c_2 + 2(-1)^n + \Omega(n)\Bigr] + O\!\left(\frac{1}{n^4}\right),$$

where $c_1, c_2$ are computable constants and $\Omega(n)$ has period 3. (The paper [227]
derives a complete expansion based on a combination of Darboux’s method and singularity
analysis.) ◁

Darboux’s method. The method of Darboux (also known as the Darboux–Pólya method) requires,
as regularity condition, that functions be sufficiently differentiable (“smooth”) on their
circle of convergence. What lies at the heart of the method is a simple relation between the
smoothness of a function and the decrease of its Taylor coefficients.

Theorem VI.14 (Darboux’s method). Assume that $f(z)$ is continuous in the closed disc
$|z| \le 1$ and is, in addition, $k$ times continuously differentiable ($k \ge 0$) on
$|z| = 1$. Then

$$[z^n] f(z) = o\!\left(\frac{1}{n^k}\right). \tag{77}$$

Proof. Start from Cauchy’s coefficient formula

$$f_n = \frac{1}{2i\pi}\int_{C} f(z)\,\frac{dz}{z^{n+1}}.$$

Because of the continuity assumption, one may take as integration contour $C$ the unit
circle. Setting $z = e^{i\theta}$ yields the Fourier version of Cauchy’s coefficient formula,

$$f_n = \frac{1}{2\pi}\int_0^{2\pi} f(e^{i\theta})\, e^{-ni\theta}\, d\theta. \tag{78}$$

The integrand in (78) is strongly oscillating. The Riemann–Lebesgue lemma of classical
analysis [577, p. 403] shows that the integral tends to 0 as $n \to \infty$.

The argument above covers the case $k = 0$. For a general $k$, successive integrations by
parts give

$$[z^n] f(z) = \frac{1}{2\pi (in)^k}\int_0^{2\pi} f^{(k)}(e^{i\theta})\, e^{-ni\theta}\, d\theta,$$

a quantity that is $o(n^{-k})$, by Riemann–Lebesgue again (here $f^{(k)}$ is understood as
the $k$th derivative of $\theta \mapsto f(e^{i\theta})$).

Various consequences of Theorem VI.14 are given in reference texts also under the name of
Darboux’s method. See for instance [129, 309, 329, 608]. We shall only illustrate the
mechanism by rederiving in this framework the analysis of the EGF of 2-regular graphs
(Example VI.2, p. 395). We have

$$f(z) = \frac{e^{-z/2 - z^2/4}}{\sqrt{1-z}} = \frac{e^{-3/4}}{\sqrt{1-z}} + e^{-3/4}\sqrt{1-z} + R(z). \tag{79}$$

There $R(z)$ is the product of $(1-z)^{3/2}$ with a function analytic at $z = 1$ that is a
remainder in the Taylor expansion of $e^{-z/2 - z^2/4}$. Thus, $R(z)$ is of class $C^1$, i.e.,
continuously differentiable once. By Theorem VI.14, we have

$$[z^n] R(z) = o\!\left(\frac{1}{n}\right),$$

so that

$$[z^n] f(z) = \frac{e^{-3/4}}{\sqrt{\pi n}} + o\!\left(\frac{1}{n}\right). \tag{80}$$
Darboux’s method bears some resemblance to singularity analysis in that the estimates are
derived from translating error terms in expansions. However, smoothness conditions, rather
than plain order of growth information, are required by it. The method is often applied, in
situations similar to (79)–(80), to functions that are products of the type
$h(z)(1-z)^{\alpha}$ with $h(z)$ analytic at 1. In such particular cases, Darboux’s method is
however subsumed by singularity analysis.

It is inherent in Darboux’s method that it cannot be applied to functions whose singular
expansion only involves terms that become infinite, while singularity analysis can. A clear
example arises in the analysis of the common subexpression problem [257] where there occurs
a function with a singular expansion of the form

$$\frac{1}{\sqrt{1-z}}\,\frac{1}{\sqrt{\log\frac{1}{1-z}}}\left[1 + \frac{c_1}{\log\frac{1}{1-z}} + \cdots\right].$$

▷ VI.32. Darboux versus singularity analysis. This note provides an instance where
Darboux’s method applies whereas singularity analysis does not. Let
$$F_r(z) = \sum_{n=0}^{\infty} \frac{z^{2^n}}{(2^n)^r}.$$
The function $F_0(z)$ is singular at every point of the unit circle, and the same property
holds for any $F_r$ with $r \in \mathbb{Z}_{\ge 0}$. [Hint: $F_0$, which satisfies the
functional equation $F(z) = z + F(z^2)$, grows unboundedly near $2^n$th roots of unity.]
Darboux’s method can be used to derive

$$[z^n]\,\frac{F_5(z)}{\sqrt{1-z}} = \frac{c}{\sqrt{\pi n}} + o\!\left(\frac{1}{n}\right), \qquad c = \frac{32}{31}.$$

What is the best error term that can be obtained? ◁


VI.12. Perspective


The method of singularity analysis expands our ability to extract coefficient asymptotics to
a far wider class of functions than the meromorphic and rational functions of Chapters IV
and V. This ability is the fundamental tool for analysing many of the generating functions
provided by the symbolic method of Part A, and it is applicable at a considerable level of
generality.

The basic method is straightforward and appealing: we locate singularities, es- 
tablish analyticity in a domain around them, expand the functions around the singular- 
ities, and apply general transfer theorems to take each term in the function expansion 
to a term in the asymptotic expansion of its coefficients. The method applies directly 
to a large variety of explicitly given functions, for instance combinations of rational
functions, square roots, and logarithms, as well as to functions that are implicitly
defined, like generating functions for tree structures, which are obtained by analytic 
inversion. Functions amenable to singularity analysis also enjoy rich closure prop- 
erties, and the corresponding operations mirror the natural operations on generating 
functions implied by the combinatorial constructions of Chapters I-III. 

This approach again sets us in the direction of the ideal situation of having a 
theory where combinatorial constructions and analytic methods fully correspond, but, 
again, the very essence of analytic combinatorics is that the theorems that provide 
asymptotic results cannot be so general as to be free of analytic side conditions. In 
the case of singularity analysis, these side conditions have to do with establishing an- 
alyticity in a domain around singularities. Such conditions are automatically satisfied 
by a large number of functions with moderate (at most polynomial) growth near their 
dominant singularities, justifying precisely what we need: the term-by-term transfer 
from the expansion of a generating function at its singularity to an asymptotic form of 
coefficients, including error terms. The calculations involved in singularity analysis 
are rather mechanical. (Salvy [528] has indeed succeeded in automating the analysis 
of a large class of generating functions in this way.) 

Again, we can look carefully at specific combinatorial constructions and then ap- 
ply singularity analysis to general abstract schemas, thereby solving whole classes of 
combinatorial problems at once. This process, along with several important examples, 
is the topic of Chapter VII, to come next. After that, we introduce, in Chapter VIII,
the saddle-point method, which is appropriate for functions without singularities at a
finite distance (entire functions) as well as those whose growth is rapid (exponential)
near their singularities. Singularity analysis will surface again in Chapter IX, given
its crucial technical rôle in obtaining uniform expansions of multivariate generating
functions near singularities.


Bibliographic notes. Excellent surveys of asymptotic methods in enumeration have been given 
by Bender [36] and more recently Odlyzko [461]. A general reference to asymptotic analy- 
sis that has a remarkably concrete approach is De Bruijn’s book [143]. Comtet’s [129] and 
Wilf’s [608] books each devote a chapter to these questions. 

This chapter is largely based on the theory developed by Flajolet and Odlyzko in [248], 
where the term “singularity analysis” originates. An important early (and unduly neglected) 
reference is the study by Wong and Wyman [615]. The theory draws its inspiration from classi- 
cal analytic number theory, for instance the prime number theorem where similar contours are 
used (see the discussion in [248] for sources). Another area where Hankel contours are used 
is the inversion theory of integral transforms [168], in particular in the case of algebraic and 
logarithmic singularities. Closure properties developed here are from the articles [208, 223] by 
Flajolet, Fill, and Kapur. 

Darboux’s method can often be employed as an alternative to singularity analysis. Al- 
though it is still a widely used technique in the literature, the direct mapping of asymptotic scales 
afforded by singularity analysis appears to us to be much more transparent. Darboux’s method is 
well explained in the books by Comtet [129], Henrici [329], Olver [465], and Wilf [608]. Taube- 
rian theory is treated in detail in Postnikov’s monograph [494] and Korevaar’s encyclopaedic 
treatment [389], with an excellent introduction to be found in Titchmarsh’s book [577]. 
Applications of Singularity Analysis


Mathematics is being lazy. Mathematics is letting the principles do the work for you
so that you do not have to do the work for yourself.¹


— GEORGE POLYA 


I wish to God these calculations had been executed by steam. 


— CHARLES BABBAGE (1792-1871) 
Singularity analysis paves the way to the analysis of a large quantity of generating
functions, as provided by the symbolic method expounded in Chapters I–III. In accordance
with Pólya’s aphorism quoted above, it makes it possible to “be lazy” and “let the
principles work for you”. In this chapter we illustrate this situation with numerous
examples related to languages, permutations, trees, and graphs of various sorts. As in
Chapter V, most analyses are organized into broad classes called schemas.


First, we develop the general exp–log schema, which covers the set construction,
either labelled or unlabelled, applied to generators whose dominant singularity
is of logarithmic type. This typically non-recursive schema parallels in generality 
the supercritical schema of Chapter V, which is relative to sequences. It permits us to 
quantify various constructions of permutations, derangements, 2-regular graphs, mappings,
and functional graphs, and provides information on factorization properties of
polynomials over finite fields. 


¹ Quoted in M. Walter, T. O’Brien, Memories of George Pólya, Mathematics Teaching 116 (1986).
² “There is an imperishable tree, it is said, that has its roots upward and its branches down
and whose leaves are the Hymns [Vedas]. He who knows it possesses knowledge.”

Next, we deal with recursively defined structures, whose study constitutes the main theme
of this chapter. In that case, generating functions are accessible by means of equations or
systems that implicitly define them. A distinctive feature of many such combinatorial types
is that their generating functions have a square-root singularity, that is, the singular
exponent equals 1/2. As a consequence, the counting sequences characteristically involve
asymptotic terms of the form $A^n n^{-3/2}$, where the latter asymptotic exponent, $-3/2$,
precisely reflects the singular exponent 1/2 in the function’s singular expansion, in
accordance with the general principles of singularity analysis presented in Chapter VI.
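A familiar instance of the $A^n n^{-3/2}$ pattern is given by the Catalan numbers, whose generating function $C(z) = (1 - \sqrt{1-4z})/(2z)$ has a square-root singularity at $z = 1/4$, so that $C_n \sim 4^n/(\sqrt{\pi}\, n^{3/2})$. The short Python sketch below, with a hypothetical function name, measures how quickly the exact values approach this form.

```python
import math
from fractions import Fraction

def catalan(n):
    return math.comb(2 * n, n) // (n + 1)

def normalized_catalan(n):
    """C_n divided by its predicted asymptotic form 4^n / (sqrt(pi) n^(3/2))."""
    exact_ratio = Fraction(catalan(n), 4 ** n)   # exact ratio avoids float overflow
    return float(exact_ratio) * math.sqrt(math.pi) * n ** 1.5
```

The normalized values tend to 1 with a relative error of order $1/n$; at $n = 1000$ the deviation is already close to one part in a thousand.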

Trees are the prototypical recursively defined combinatorial type. Square-root 
singularities automatically arise for all varieties of trees constrained by a finite set of 
allowed node degrees, including binary trees, unary–binary trees, ternary trees, and
many more. The counting estimates involve the characteristic $n^{-3/2}$ subexponential
factor, a property that holds in the labelled and unlabelled frameworks alike. 

Simple varieties of trees have many properties in common, beyond the subexponential
growth factor of tree counts. Indeed, in a random tree of some large size $n$,
almost all nodes are found to be at level about $\sqrt{n}$, path length grows on average like
$n\sqrt{n}$, and height is of order $\sqrt{n}$, with high probability. These results serve to
unify classical tree types: we say that such properties of random trees are universal³ among
all simply generated families sharing the square-root singularity property. (This notion 
of universality, borrowed from physics, is also nowadays finding increasing popularity 
among probabilists, for reasons much similar to ours.) In this perspective, the motiva- 
tion for organizing the theory along the lines of major schemas fits perfectly with the 
quest of universal laws in analytic combinatorics. 

In the context of simple varieties of trees, the square-root singularity arises from 
general properties of the inverse of an analytic function. Under suitable conditions, 
this characteristic feature can be extended to functions defined implicitly by a func- 
tional equation. Consequences are the general enumeration of non-plane unlabelled 
trees, including isomers of alkanes in theoretical chemistry, as well as secondary struc- 
tures of molecular biology. 

Much of this chapter is devoted to context-free specifications and languages. In 
that case, a priori, generating functions are algebraic functions, meaning that they sat- 
isfy a system of polynomial equations, itself optionally reducible (by elimination) to 
a single equation. For solutions of positive polynomial systems, square-root singular- 
ities are found to be the rule under a simple technical condition of irreducibility that is 
evocative of the Perron—Frobenius conditions encountered in Chapter V in relation to 
finite-state and transfer-matrix models. As an illustration, we show how to develop a 


³ The following quotation illustrates well the notion of universality in physics: “[...] this echoes the
notion of universality in statistical physics. Phenomena that appear at first to be unconnected, such as mag- 
netism and the phase changes of liquids and gases, share some identical features. This universal behaviour 
pays no heed to whether, say, the fluid is argon or carbon dioxide. All that matters are broad-brush charac- 
teristics such as whether the system is one-, two- or three-dimensional and whether its component elements 
interact via long- or short-range forces. Universality says that sometimes the details do not matter.” [From 
“Utopia Theory”, in Physics World, August 2003]. 
coherent theory of topological configurations in the plane (trees, forests, graphs) that
satisfy a non-crossing constraint. 


For arbitrary algebraic functions (the ones that are not necessarily associated with 
positive coefficients and equations, or irreducible positive systems), a richer set of sin- 
gular behaviours becomes possible: singular expansions involve fractional exponents 
(not just 1/2, corresponding to the square-root paradigm above). Singularity analysis 
is invariably applicable: algebraic functions are viewed as plane algebraic curves, and 
the famous Newton—Puiseux theorem of elementary algebraic geometry completely 
describes the types of singularities that may occur. Algebraic functions also surface
as solutions of various types of functional equations: this turns out to be the case for 
many classes of walks that generalize Dyck and Motzkin paths, via what is known 
as the kernel method, as well as for many types of planar maps (embedded planar 
graphs), via the so-called quadratic method. In all these cases, singular exponents of a 
predictable (rational) form are bound to occur, implying in turn numerous quantitative 
properties of random discrete structures and universality phenomena.


Differential equations and systems are associated to recursively defined structures,
when either pointing constructions or order constraints appear. For counting generat- 
ing functions, the equations are nonlinear, while the GFs associated to additive param- 
eters lead to linear versions. Differential equations are also central in connection with 
the holonomic framework⁴, which intervenes in the enumeration of many classes of
“hard” objects, like regular graphs and Latin rectangles. Singularity analysis is once 
more instrumental in working out precise asymptotic estimates—the appearance of 
singular exponents that are algebraic (rather than rational) numbers is a characteristic 
feature of many such estimates. We examine here applications relative to quadtrees 
and to varieties of increasing trees, some of which are closely related to permutations 
as well as to algorithms and data structures for sorting and searching. 


VII.1. A roadmap to singularity analysis asymptotics 


The singularity analysis theorems of Chapter VI, which may be coarsely summa- 
rized by the correspondence 


(1)    $(1 - z/\rho)^{-\alpha} \quad\longrightarrow\quad \frac{1}{\Gamma(\alpha)} \rho^{-n} n^{\alpha-1}$


serve as our main asymptotic engine throughout this chapter. Singularity analysis is 
instrumental in quantifying properties of non-recursive as well as recursive structures. 
Our reader might be surprised not to encounter integration contours anymore in this 
chapter. Indeed, it now suffices to work out the local analysis of functions at their 
singularities, then the general theorems of singularity analysis (Chapter VI) effect the 
translation to counting sequences and parameters automatically. 
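
As a numerical illustration of correspondence (1), the following Python sketch compares the exact coefficient $[z^n](1 - z/\rho)^{-\alpha} = \rho^{-n}\,\Gamma(n+\alpha)/(\Gamma(\alpha)\,n!)$ with the estimate $\rho^{-n} n^{\alpha-1}/\Gamma(\alpha)$. The function names and the sample values of $\alpha$, $\rho$ and $n$ are choices made here for illustration, and the sketch assumes $\alpha > 0$.

```python
import math

def coeff_exact(n, alpha, rho):
    """Exact [z^n] (1 - z/rho)^(-alpha), computed through logarithms of Gamma values."""
    log_value = (math.lgamma(n + alpha) - math.lgamma(alpha)
                 - math.lgamma(n + 1) - n * math.log(rho))
    return math.exp(log_value)

def coeff_asymptotic(n, alpha, rho):
    """Main term of singularity analysis: rho^(-n) * n^(alpha - 1) / Gamma(alpha)."""
    return rho ** (-n) * n ** (alpha - 1) / math.gamma(alpha)

if __name__ == "__main__":
    for alpha in (0.5, 1.5, 3.0):
        ratio = coeff_exact(500, alpha, 0.5) / coeff_asymptotic(500, alpha, 0.5)
        print(f"alpha = {alpha}: exact / estimate = {ratio:.6f}")
```

The printed ratios are close to 1, with a relative gap of order $1/n$.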


⁴ Holonomic functions (Appendix B.4: Holonomic functions, p. 748) are defined as solutions of linear
differential equations with coefficients that are rational functions. 
The exp-log schema. This schema, examined in Section VII.2, is relative to the
labelled set construction, 


(2)    $\mathcal{F} = \mathrm{SET}(\mathcal{G}) \implies F(z) = \exp(G(z))$,


as well as its unlabelled counterparts, MSET and PSET: an $\mathcal{F}$-structure is thus constructed
(non-recursively) as an unordered assembly of $\mathcal{G}$-components. In the case
where the GF of components is logarithmic at its dominant singularity,

(3)    $G(z) \sim \kappa \log \frac{1}{1 - z/\rho} + \lambda$,

an immediate computation shows that $F(z)$ has a singularity of the power type,

       $F(z) \sim e^{\lambda} (1 - z/\rho)^{-\kappa}$,

which is clearly in the range of singularity analysis. The construction (2), supple- 
mented by simple technical conditions surrounding (3), defines the exp—log schema. 
Then, for such F-structures that are assemblies of logarithmic components, the asymp- 
totic counting problem is systematically solvable (Theorem VII.1, p. 446): the number 
of $\mathcal{G}$-components in a large random $\mathcal{F}$-structure is $O(\log n)$, both in the mean and in
probability, while more refined estimates describe precisely the likely shape of pro- 
files. This schema has a generality comparable to the supercritical schema examined 
in Section V. 2, p. 293, but the probabilistic phenomena at stake appear to be in sharp 
contrast: the number of components is typically small, being logarithmic for exp—log 
sets, as opposed to a linear growth in the case of supercritical sequences. The schema 
can be used to analyse properties of permutations, functional graphs, mappings, and 
polynomials over finite fields.


Recursion and the universality of square-root singularity. A major theme of 
this chapter is the study of asymptotic properties of recursive structures. In a large 
number of cases, functions with a square root singularity are encountered, and given 
the usual correspondence, 


       $f(z) \sim -(1 - z)^{1/2} \quad\longrightarrow\quad f_n \sim \frac{1}{2\sqrt{\pi n^3}}$,

the corresponding coefficients are of the asymptotic form $C\rho^{-n} n^{-3/2}$. Several schemas
can be described to capture this phenomenon; we develop here, in order of increas- 
ing structural complexity, the ones corresponding to simple varieties of trees, implicit 
structures, Pólya operators, and irreducible polynomial systems.
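
The square-root correspondence can be checked directly on $f(z) = -(1 - z)^{1/2}$. The sketch below computes the exact coefficients $f_n$ with rational arithmetic, using the binomial-series recurrence $c_n = c_{n-1}(2n-3)/(2n)$ for the coefficients of $(1-z)^{1/2}$, and compares them with $1/(2\sqrt{\pi n^3})$. The function names are illustrative and not taken from the text.

```python
import math
from fractions import Fraction

def sqrt_coeffs(n_max):
    """Exact coefficients f_n = [z^n] of f(z) = -(1 - z)^(1/2), for 0 <= n <= n_max."""
    c = Fraction(1)                      # [z^0] (1 - z)^(1/2)
    out = [-c]
    for n in range(1, n_max + 1):
        c = c * Fraction(2 * n - 3, 2 * n)   # binomial-series recurrence
        out.append(-c)
    return out

def sqrt_estimate(n):
    """Asymptotic form 1 / (2 sqrt(pi n^3))."""
    return 1.0 / (2.0 * math.sqrt(math.pi * n ** 3))

if __name__ == "__main__":
    f = sqrt_coeffs(400)
    for n in (10, 100, 400):
        print(n, float(f[n]) / sqrt_estimate(n))
```

The ratio tends to 1 as $n$ grows, which is the $n^{-3/2}$ law in its simplest instance.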


Simple varieties of trees and inverse functions. Our treatment of recursive com- 
binatorial types starts with simple varieties of trees, studied in Section VII. 3. In the 
basic situation, that of plane unlabelled trees, the equation is 


(4)    $\mathcal{Y} = \mathcal{Z} \times \mathrm{SEQ}_{\Omega}(\mathcal{Y}) \implies Y(z) = z\phi(Y(z))$,


with, as usual, $\phi(w) = \sum_{\omega \in \Omega} w^{\omega}$. Thus, the OGF $Y(z)$ is determined as the inverse
of $w/\phi(w)$, where the function $\phi$ reflects the collection of all allowed node degrees
($\Omega$). From analytic function theory, we know that singularities of the inverse of an
analytic function are generically of the square-root type (Subsection IV.7.1, p. 275
and Section VI.7, p. 402), and such is the case whenever $\Omega$ is a “well-behaved” set
of integers, in particular, a finite set. Then, the number of trees invariably satisfies an
estimate of the form 


(5)    $Y_n = [z^n]Y(z) \sim C A^n n^{-3/2}$.


Square-root singularity is also attached to several universality phenomena, as evoked 
in the general introduction to this chapter. 


Tree-like structures and implicit functions. Functions defined implicitly by an 
equation of the form 


(6) Y(z) = G(z, Y(z)) 


where $G$ is bivariate analytic, has non-negative coefficients, and satisfies a natural set
of conditions also lead to square-root singularity (Section VII.4 and Theorem VII.3,
p. 468). The schema (6) obviously generalizes (4): simply take $G(z, y) = z\phi(y)$.
Again, such functions invariably satisfy an estimate (5).


Trees under symmetries and Pólya operators. The analytic methods mentioned
above can be further extended to Pólya operators, which translate unlabelled set and
cycle constructions; see Section VII.5. A typical application is to the class of non-plane
unlabelled trees whose OGF satisfies the infinite functional equation,

       $H(z) = z \exp\left( \frac{H(z)}{1} + \frac{H(z^2)}{2} + \cdots \right)$.


Singularity analysis applies more generally to varieties of non-plane unlabelled trees 
(Theorem VII.4, p. 479), which covers the enumeration of various types of interesting 
molecules in combinatorial chemistry. 
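
To make the infinite functional equation for non-plane unlabelled rooted trees concrete, here is a small Python sketch that computes the first counts. It relies on the classical recurrence $n\,a_{n+1} = \sum_{k=1}^{n} \big( \sum_{d \mid k} d\,a_d \big)\, a_{n-k+1}$, which follows from the equation $H(z) = z \exp\big(\sum_{k \ge 1} H(z^k)/k\big)$ by logarithmic differentiation; this recurrence is a standard fact that is not stated in this passage, and the function name is illustrative.

```python
def rooted_tree_counts(n_max):
    """Numbers a_1..a_{n_max} of non-plane unlabelled rooted trees, indexed by size."""
    a = [0, 1]                        # a[n] for n >= 1; a[0] is a placeholder
    s = [0] * (n_max + 1)             # s[k] = sum over divisors d of k of d * a[d]
    for n in range(1, n_max):
        s[n] = sum(d * a[d] for d in range(1, n + 1) if n % d == 0)
        total = sum(s[k] * a[n - k + 1] for k in range(1, n + 1))
        a.append(total // n)          # the division is exact
    return a[1:]

if __name__ == "__main__":
    print(rooted_tree_counts(10))
```

The successive ratios $a_{n+1}/a_n$ grow slowly towards the exponential growth rate $A$ of estimate (5), the $n^{-3/2}$ factor accounting for the slow convergence.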


Context-free structures and polynomial systems. The generating function of any 
context-free class or language is known to be a component of a system of positive 
polynomial equations 


       $y_1 = P_1(z, y_1, \ldots, y_r)$
       $\vdots$
       $y_r = P_r(z, y_1, \ldots, y_r)$.

The $n^{-3/2}$ counting law is once more universal among such combinatorial classes
under a basic condition of “irreducibility” (Section VII. 6 and Theorem VII.5, p. 483). 
In that case, the GFs are algebraic functions satisfying a strong positivity constraint; 
the corresponding analytic statement constitutes the important Drmota—Lalley—Woods 
Theorem (Theorem VII.6, p. 489). 

Note that there is a progression in the complexity of the schemas leading to 
square-root singularity. From the analytic standpoint, this can be roughly rendered 
by a chain 

       inverse functions $\longrightarrow$ implicit functions $\longrightarrow$ systems.


It is, however, often meaningful to treat each combinatorial problem at its minimal 
level of generality, since expressions tend to become less and less explicit as complex- 
ity increases. 
General algebraic functions. In essence, the coefficients of all algebraic functions
can be analysed asymptotically (Section VII.7). There are only minor limitations
arising from the possible presence of several dominant singularities, like in the ratio- 
nal function case. The starting point is the characterization of the local behaviour of 
an algebraic function at any of its singularities, which is provided by the Newton–Puiseux
theorem: if $\zeta$ is a singularity, then the branch $Y(z)$ of an algebraic function
admits near $\zeta$ a representation of the form

(7)    $Y(z) = Z^{r/s} \sum_{k \ge 0} c_k Z^{k/s}, \qquad Z := (1 - z/\zeta)$,

for some $r/s \in \mathbb{Q}$, so that the singular exponent is invariably a rational number.
Singularity analysis is systematically applicable, so that the nth coefficient of Y is 
expressible as a finite linear combination of terms, each of the asymptotic form 


(8)    $\zeta^{-n} n^{p/q}, \qquad p/q \in \mathbb{Q} \setminus \{-1, -2, \ldots\}$;


see also Figure VII.1. The various quantities (like $\zeta$, $r$, $s$) entering the asymptotic
expansion of the coefficients of an algebraic function turn out to be effectively com- 
putable. 

Beside providing a wide-encompassing conceptual framework of independent in- 
terest, the general theory of algebraic coefficient asymptotics is applicable whenever 
the combinatorial problems considered are not amenable to any of the special schemas 
previously described. For instance, certain kinds of supertrees (these are defined as 
trees composed with trees, Example VII.10, p. 412) lead to the singular type $Z^{1/4}$,
which is reflected by an unusual subexponential factor of $n^{-5/4}$ present in asymptotic
counts. Maps, which are planar graphs drawn in the plane (or on the sphere), satisfy a
universality law with a singular exponent equal to 3/2, which is associated to counting
sequences involving an asymptotic $n^{-5/2}$ factor.


Differential equations and systems. When recursion is combined with point- 
ing or with order constraints, enumeration problems translate into integro-differential 
equations. Section VII.9 examines the types of singularities that may occur in two 
important cases: (i) linear differential equations; (ii) nonlinear differential equations.

Linear differential equations arise from the analysis of parameters of splitting 
processes that extend the framework of tree recurrences (Subsection VI. 10.3, p. 427), 
and we treat the geometric quadtree structure in this perspective. An especially notable 
source of linear differential equations is the class of holonomic functions (solutions of 
linear equations with rational coefficients, cf Appendix B.4: Holonomic functions, 
p. 748), which includes GFs of Latin rectangles, regular graphs, permutations con- 
strained by the length of their longest increasing subsequence, Young tableaux and 
many more structures of combinatorial theory. In an important case, that of a “regu- 
lar” singularity, asymptotic forms can be systematically extracted. The singularities 
that may occur extend the algebraic ones (7), and the corresponding coefficients are 
then asymptotically composed of elements of the form 


(9)    $\zeta^{-n} n^{\theta} (\log n)^{\ell}$,
Rational    Irred. linear system   $\zeta^{-n}$                                            Perron–Frobenius, merom. fns, Ch. V
            General rational       $\zeta^{-n} n^{\ell}$                                   meromorphic functions, Ch. V
Algebraic   Irred. positive sys.   $\zeta^{-n} n^{-3/2}$                                   DLW Th., sing. analysis, this chapter, §VII.6, p. 482
            General algebraic      $\zeta^{-n} n^{p/q}$                                    Puiseux, sing. analysis, this chapter, §VII.7, p. 493
Holonomic   Regular sing.          $\zeta^{-n} n^{\theta} \log^{\ell} n$                   ODE, sing. analysis, this chapter, §VII.9.1, p. 518
            Irregular sing.        $\zeta^{-n} e^{P(n^{1/r})} n^{\theta} \log^{\ell} n$    ODE, saddle-point, §VIII.7, p. 581

Figure VII.1. A telegraphic summary of a hierarchy of special functions by increasing
level of generality: asymptotic elements composing coefficients and the coefficient
extraction method (with $\ell, r \in \mathbb{Z}_{\ge 0}$, $p/q \in \mathbb{Q}$, $\zeta$ and $\theta$ algebraic, and $P$ a
polynomial).


($\theta$ an algebraic quantity, $\ell \in \mathbb{Z}_{\ge 0}$), a type which is much more general than (8).

Nonlinear differential equations are typically attached to the enumeration of trees 
satisfying various kinds of order constraints. A global treatment is intrinsically not 
possible, given the extreme diversity of singular expansions that may occur. Accord- 
ingly, we restrict attention to first-order nonlinear equations of the form 


       $\frac{d}{dz} Y(z) = \phi(Y(z))$,


which covers varieties of increasing trees and certain urn processes, including several 
models closely related to permutations. 

Figure VII.1 summarizes three classes of special functions encountered in this 
book, namely, rational, algebraic, and holonomic. When structural complexity in- 
creases, a richer set of asymptotic coefficient behaviours becomes possible. (The com- 
plex asymptotic methods employed extend much beyond the range summarized in the 
figure. For instance, the class of irreducible positive systems of polynomial equations
is part of the general square-root singularity paradigm, also encountered with Pólya
operators, as well as inverse and implicit functions in non-algebraic cases.)


VII.2. Sets and the exp-log schema


We begin by examining a schema that is structurally comparable to the supercrit- 
ical sequence schema of Section V. 2, p. 293, but one that requires singularity analysis 
for coefficient extraction. The starting point is the construction of permutations ($\mathcal{P}$) as
labelled sets of cyclic permutations ($\mathcal{K}$):

(10)   $\mathcal{P} = \mathrm{SET}(\mathcal{K}) \implies P(z) = \exp(K(z)), \quad\text{where}\quad K(z) = \log \frac{1}{1 - z}$,

which gives rise to many easy explicit calculations. For instance, the probability that
a random permutation consists of a unique cycle is $1/n$ (since it equals $K_n/P_n$); the
number of cycles is asymptotic to $\log n$, both on average (p. 122) and in probability
(Example II.4, p. 160); the probability that a random permutation has no singleton
cycle is $\sim e^{-1}$ (the derangement problem; see pp. 123 and 228).

Similar properties hold true under surprisingly general conditions. We start with 
definitions that describe the combinatorial classes of interest. 


Definition VII.1. A function $G(z)$ analytic at 0, having non-negative coefficients and
finite radius of convergence $\rho$ is said to be of $(\kappa, \lambda)$-logarithmic type, where $\kappa \ne 0$, if
the following conditions hold:

(i) the number $\rho$ is the unique singularity of $G(z)$ on $|z| = \rho$;
(ii) $G(z)$ is continuable to a $\Delta$-domain at $\rho$;
(iii) $G(z)$ satisfies

(11)   $G(z) = \kappa \log \frac{1}{1 - z/\rho} + \lambda + O\left( \frac{1}{(\log(1 - z/\rho))^2} \right)$   as $z \to \rho$ in $\Delta$.


Definition VII.2. The labelled construction $\mathcal{F} = \mathrm{SET}(\mathcal{G})$ is said to be a labelled
exp-log schema (“exponential-logarithmic schema”) if the exponential generating
function $G(z)$ of $\mathcal{G}$ is of logarithmic type. The unlabelled construction $\mathcal{F} = \mathrm{MSET}(\mathcal{G})$
is said to be an unlabelled exp-log schema if the ordinary generating function $G(z)$
of $\mathcal{G}$ is of logarithmic type, with $\rho < 1$. In each case, the quantities $(\kappa, \lambda)$ of (11) are
referred to as the parameters of the schema.

By the fact that $G(z)$ has positive coefficients, we must have $\kappa > 0$, while the sign
of $\lambda$ is arbitrary. The definitions and the main properties to be derived for unlabelled
multisets easily extend to the powerset construction: see Notes VII.1 and VII.5 below.


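Theorem VII.1 (Exp-log schema). Consider an exp-log schema with parameters
$(\kappa, \lambda)$.

(i) The counting sequences satisfy

       $[z^n]G(z) = \frac{\kappa}{n} \rho^{-n} \left( 1 + O\left( (\log n)^{-2} \right) \right)$,
       $[z^n]F(z) = \frac{e^{\lambda + r_0}}{\Gamma(\kappa)} n^{\kappa-1} \rho^{-n} \left( 1 + O\left( (\log n)^{-2} \right) \right)$,

where $r_0 = 0$ in the labelled case and $r_0 = \sum_{j \ge 2} G(\rho^j)/j$ in the case of unlabelled
multisets.
(ii) The number $X$ of $\mathcal{G}$-components in a random $\mathcal{F}$-object satisfies

       $\mathbb{E}_{\mathcal{F}_n}(X) = \kappa(\log n - \psi(\kappa)) + \lambda + r_1 + O\left( (\log n)^{-1} \right)$   $\left( \psi(s) \equiv \frac{d}{ds} \log \Gamma(s) \right)$,

where $r_1 = 0$ in the labelled case and $r_1 = \sum_{j \ge 2} G(\rho^j)$ in the case of unlabelled
multisets. The variance satisfies $\mathbb{V}_{\mathcal{F}_n}(X) = O(\log n)$, and, in particular, the distribution⁵
of $X$ is concentrated around its mean.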


⁵ We shall see in Subsection IX.7.1 (p. 667) that, in addition, the asymptotic distribution of X is
invariably Gaussian under such exp—log conditions. 
Proof. This result is from an article by Flajolet and Soria [258], with a correction
to the logarithmic type condition given by Jennie Hansen [318]. We first discuss the
labelled case, $\mathcal{F} = \mathrm{SET}(\mathcal{G})$, so that $F(z) = \exp G(z)$.

(i) The estimate for $[z^n]G(z)$ follows directly from singularity analysis with logarithmic
terms (Theorem VI.4, p. 393). Regarding $F(z)$, we find, by exponentiation,

(12)   $F(z) = \frac{e^{\lambda}}{(1 - z/\rho)^{\kappa}} \left( 1 + O\left( \frac{1}{(\log(1 - z/\rho))^2} \right) \right)$.

Like $G$, the function $F = e^{G}$ has an isolated singularity at $\rho$, and is continuable to
the $\Delta$-domain in which the expansion (11) is valid. The basic transfer theorem then
provides the estimate of $[z^n]F(z)$.

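(ii) Regarding the number of components, the BGF of $\mathcal{F}$ with $u$ marking the
number of $\mathcal{G}$-components is $F(z, u) = \exp(uG(z))$, in accordance with the general
developments of Chapter III. The function

       $f_1(z) := \left. \frac{\partial}{\partial u} F(z, u) \right|_{u=1} = F(z)G(z)$,

is the EGF of the cumulated values of $X$. It satisfies near $\rho$

       $f_1(z) = \frac{e^{\lambda}}{(1 - z/\rho)^{\kappa}} \left( \kappa \log \frac{1}{1 - z/\rho} + \lambda \right) \left( 1 + O\left( \frac{1}{(\log(1 - z/\rho))^2} \right) \right)$,

whose translation, by singularity analysis theory is immediate:

       $[z^n]f_1(z) \equiv \mathbb{E}_{\mathcal{F}_n}(X) \cdot [z^n]F(z) = \frac{e^{\lambda}}{\Gamma(\kappa)} n^{\kappa-1} \rho^{-n} \left( \kappa \log n - \kappa\psi(\kappa) + \lambda + O\left( (\log n)^{-1} \right) \right)$.

This provides the mean value estimate of $X$ as $[z^n]f_1(z)/[z^n]F(z)$. The variance
analysis is conducted in the same way, using a second derivative.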

For the unlabelled case, the analysis of $[z^n]G(z)$ can be recycled verbatim. First,
given the assumptions, we must have $\rho < 1$ (since otherwise $[z^n]G(z)$ could not be
an integer). The classical translation of multisets (Chapter I) rewrites as

       $F(z) = \exp(G(z) + R(z)), \qquad R(z) := \sum_{j=2}^{\infty} \frac{G(z^j)}{j}$,

where $R(z)$ involves terms of the form $G(z^2), \ldots$, each being analytic in $|z| < \rho^{1/2}$.
Thus, $R(z)$ is itself analytic, as a uniformly convergent sum of analytic functions, in
$|z| < \rho^{1/2}$. (This follows the usual strategy for treating Pólya operators in asymptotic
theory.) Consequently, $F(z)$ is $\Delta$-analytic. As $z \to \rho$, we then find

(13)   $F(z) = \frac{e^{\lambda + r_0}}{(1 - z/\rho)^{\kappa}} \left( 1 + O\left( \frac{1}{(\log(1 - z/\rho))^2} \right) \right), \qquad r_0 \equiv \sum_{j=2}^{\infty} \frac{G(\rho^j)}{j}$.

The asymptotic expansion of $[z^n]F(z)$ then results from singularity analysis.
The BGF $F(z, u)$ of $\mathcal{F}$, with $u$ marking the number of $\mathcal{G}$-components, is

       $F(z, u) = \exp\left( \frac{uG(z)}{1} + \frac{u^2 G(z^2)}{2} + \cdots \right)$.
    $\mathcal{F}$       $\kappa$   n = 100    n = 272    n = 739
    Permutations    1          5.18737    6.18485    7.18319
    Derangements    1          4.19732    5.18852    6.18454
    2-regular       1/2        2.53439    3.03466    3.53440
    Mappings        1/2        2.97898    3.46320    3.95312

Figure VII.2. Some exp-log structures ($\mathcal{F}$) and the mean number of $\mathcal{G}$-components
for $n = 100$, $272 \equiv \lceil 100 \cdot e \rfloor$, $739 \equiv \lceil 100 \cdot e^2 \rfloor$: the columns differ by about $\kappa$, as
expected.


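Consequently,

       $f_1(z) := \left. \frac{\partial}{\partial u} F(z, u) \right|_{u=1} = F(z)(G(z) + R_1(z)), \qquad R_1(z) = \sum_{j=2}^{\infty} G(z^j)$.

Again, the singularity type is that of $F(z)$ multiplied by a logarithmic term,

(14)   $f_1(z) \underset{z \to \rho}{\sim} F(z)(G(z) + r_1), \qquad r_1 \equiv \sum_{j=2}^{\infty} G(\rho^j)$.

The mean value estimate results. Variance analysis follows similarly. ∎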


▷ VII.1. Unlabelled powersets. For the powerset construction $\mathcal{F} = \mathrm{PSET}(\mathcal{G})$, the statement of
Theorem VII.1 holds with

       $r_0 = \sum_{j \ge 2} (-1)^{j-1} \frac{G(\rho^j)}{j}$,

as seen by an easy adaptation of the proof technique of Theorem VII.1. ◁

As we see below, beyond permutations, mappings, unlabelled functional graphs, 
polynomials over finite fields, 2-regular graphs, and generalized derangements belong 
to the exp—log schema; see Figure VII.2 for representative numerical data. Further- 
more, singularity analysis gives precise information on the decomposition of large F 
objects into G components. 
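
The permutation and derangement rows of Figure VII.2 can be reproduced by exact computation. The sketch below expands $F(z) = \exp(K(z))$ with rational coefficients, using the relation $F' = K'F$, and then forms $[z^n]F(z)K(z) / [z^n]F(z)$, which is the mean number of components as in the proof of Theorem VII.1. The function name and its `excluded` argument (the set of forbidden cycle lengths) are illustrative choices, and the sketch assumes that at least one admissible permutation of size $n$ exists.

```python
from fractions import Fraction

def mean_components(n, excluded=()):
    """Mean number of cycles in a random permutation of size n that has
    no cycle whose length lies in `excluded`."""
    # k[j] = [z^j] K(z), where K(z) = log 1/(1-z) minus the terms z^w / w, w excluded
    k = [Fraction(0)] + [Fraction(0) if j in excluded else Fraction(1, j)
                         for j in range(1, n + 1)]
    # f[m] = [z^m] exp(K(z)), from F' = K' F:  m f_m = sum_j j k_j f_{m-j}
    f = [Fraction(1)]
    for m in range(1, n + 1):
        f.append(sum(j * k[j] * f[m - j] for j in range(1, m + 1)) / m)
    # [z^n] F(z) K(z): cumulated number of components, divided by n!
    cumulated = sum(k[j] * f[n - j] for j in range(1, n + 1))
    return float(cumulated / f[n])

if __name__ == "__main__":
    print(mean_components(100))             # all permutations
    print(mean_components(100, {1}))        # standard derangements
```

For all permutations the value is the harmonic number $H_n$, in agreement with the estimate $\log n + \gamma$ given by the theorem with $(\kappa, \lambda) = (1, 0)$.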


Example VII.1. Cycles in derangements. The case of all permutations,

       $P(z) = \exp(K(z)), \qquad K(z) = \log \frac{1}{1 - z}$,

is immediately seen to satisfy the conditions of Theorem VII.1: it corresponds to the radius of
convergence $\rho = 1$ and parameters $(\kappa, \lambda) = (1, 0)$.

Let $\Omega$ be a finite set of integers and consider next the class $\mathcal{D} \equiv \mathcal{D}^{\Omega}$ of permutations
without any cycle of length in $\Omega$. This includes standard derangements (where $\Omega = \{1\}$). The
specification is then

       $\mathcal{D} = \mathrm{SET}(\mathcal{K}), \quad \mathcal{K} = \mathrm{CYC}_{\mathbb{Z}_{>0} \setminus \Omega}(\mathcal{Z}) \implies D(z) = \exp(K(z)), \quad K(z) = \log \frac{1}{1 - z} - \sum_{\omega \in \Omega} \frac{z^{\omega}}{\omega}$.

The theorem applies, with $\kappa = 1$, $\lambda := -\sum_{\omega \in \Omega} \omega^{-1}$; in particular, the mean number of cycles
in a random generalized derangement of size $n$ is $\log n + O(1)$. □


Example VII.2. Connected components in 2-regular graphs. The class of (undirected) 2-regular
graphs is obtained by the set construction applied to components that are themselves
undirected cycles of length $\ge 3$ (see p. 133 and Example VI.2, p. 395). In that case:

       $\mathcal{F} = \mathrm{SET}(\mathcal{G}), \quad \mathcal{G} = \mathrm{UCYC}_{\ge 3}(\mathcal{Z}) \implies F(z) = \exp(G(z)), \quad G(z) = \frac{1}{2} \log \frac{1}{1 - z} - \frac{z}{2} - \frac{z^2}{4}$.

This is an exp-log scheme with $\kappa = 1/2$ and $\lambda = -3/4$. In particular the number of components
is asymptotic to $\frac{1}{2} \log n$, both in the mean and in probability. □


Example VII.3. Connected components in mappings. The class $\mathcal{F}$ of mappings (functions
from a finite set to itself) was introduced in Subsection II.5.2, p. 129. The associated digraphs
are described as labelled sets of connected components ($\mathcal{K}$), themselves (directed) cycles of
trees ($\mathcal{T}$), so that the class of all mappings has an EGF given by

       $F(z) = \exp(K(z)), \qquad K(z) = \log \frac{1}{1 - T(z)}, \qquad T(z) = z e^{T(z)}$,

with $T$ the Cayley tree function. The analysis of inverse functions (Section VI.7 and Example
VI.8, p. 403) has shown that $T(z)$ is singular at $z = e^{-1}$, where it admits the singular
expansion $T(z) \sim 1 - \sqrt{2}\sqrt{1 - ez}$. Thus the component GF $K(z)$ is logarithmic with $\kappa = 1/2$ and $\lambda = -\log \sqrt{2}$.
As a consequence, the number of connected mappings satisfies

       $K_n \equiv n![z^n]K(z) = n^n \sqrt{\frac{\pi}{2n}} \left( 1 + O(n^{-1/2}) \right)$.

In other words: the probability for a random mapping of size $n$ to consist of a single component
is $\sim \sqrt{\frac{\pi}{2n}}$. Also, the mean number of components in a random mapping of size $n$ is

       $\frac{1}{2} \log n + \log \sqrt{2e^{\gamma}} + O(n^{-1/2})$.

Similar properties hold for mappings without fixed points, which are analogous to derangements
and were discussed in Chapter II, p. 130. We shall establish below, p. 480, that unlabelled
functional graphs also belong to the exp-log schema. □
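
The estimate for connected mappings can be compared with exact values. Expanding $K(z) = \sum_{k \ge 1} T(z)^k/k$ and applying Lagrange inversion, $[z^n]T(z)^k = \frac{k}{n} \frac{n^{n-k}}{(n-k)!}$, gives $K_n/n^n = \frac{1}{n} \sum_{k=1}^{n} \frac{n!}{(n-k)!\,n^k}$; this closed form is a standard consequence of $T = ze^T$ that is not written out in the example. The Python sketch below evaluates it in floating point; the function name is illustrative.

```python
import math

def prob_connected(n):
    """Probability K_n / n^n that a random mapping of size n is connected."""
    term, total = 1.0, 0.0
    for k in range(1, n + 1):
        total += term              # here term = n! / ((n - k)! * n^k)
        term *= (n - k) / n        # move on to the term of index k + 1
    return total / n

if __name__ == "__main__":
    for n in (10, 100, 1000):
        estimate = math.sqrt(math.pi / (2 * n))
        print(n, prob_connected(n), estimate, prob_connected(n) / estimate)
```

The ratio approaches 1 at the rate $O(n^{-1/2})$ stated in the example.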


Example VII.4. Factors of polynomials over finite fields. Factorization properties of random
polynomials over finite fields are of importance in various areas of mathematics and have
applications to coding theory, symbolic computation, and cryptography [51, 599, 541]. Example
I.20, p. 90, offers a preliminary discussion.

Let $\mathbb{F}_p$ be the finite field with $p$ elements and $\mathcal{P} \subset \mathbb{F}_p[X]$ the set of monic polynomials
with coefficients in the field. We view these polynomials as (unlabelled) combinatorial objects
with size identified to degree. Since a polynomial is specified by the sequence of its coefficients,
one has, with $\mathcal{A}$ the “alphabet” of coefficients, $\mathcal{A} = \mathbb{F}_p$ treated as a collection of atomic objects:

(15)   $\mathcal{P} = \mathrm{SEQ}(\mathcal{A}) \implies P(z) = \frac{1}{1 - pz}$.

On the other hand, the unique factorization property of polynomials entails that the class $\mathcal{I}$ of all
monic irreducible polynomials and the class $\mathcal{P}$ of all polynomials are related by $\mathcal{P} = \mathrm{MSET}(\mathcal{I})$.
[Figure VII.3: factored forms of the five sample polynomials.]

Figure VII.3. The factorizations of five random polynomials of degree 25 over $\mathbb{F}_2$.
One out of five polynomials in this sample has no root in the base field (the asymptotic
probability is 1/4 by Note VII.4).


As a consequence of Möbius inversion, one then gets (Equation (94) of Chapter I, p. 91):

(16)   $I(z) = \log \frac{1}{1 - pz} + R(z), \qquad R(z) := \sum_{k \ge 2} \frac{\mu(k)}{k} \log \frac{1}{1 - p z^k}$.

Regarding complex asymptotics, the function $R(z)$ of (16) is analytic in $|z| < p^{-1/2}$.
Thus $I(z)$ is of logarithmic type with radius of convergence $1/p$ and parameters

       $\kappa = 1, \qquad \lambda = \sum_{k \ge 2} \frac{\mu(k)}{k} \log \frac{1}{1 - p^{1-k}}$.

As already noted in Chapter I, a consequence is the asymptotic estimate $I_n \sim p^n/n$, which
constitutes a “Prime Number Theorem” for polynomials over finite fields: a fraction asymptotic
to $1/n$ of the polynomials in $\mathbb{F}_p[X]$ are irreducible. Furthermore, since $I(z)$ is logarithmic and
$\mathcal{P}$ is obtained by a multiset construction, we have an unlabelled exp-log scheme, to which
Theorem VII.1 applies. As a consequence:


The number of factors of a random polynomial of degree n has mean and variance each asymp- 
totic to log n; its distribution is concentrated. 


(See Figure VII.3 for an illustration; the mean value estimate appears in [378, Ex. 4.6.2.5].) We 
shall revisit this example in Chapter IX, p. 672, and establish a companion Gaussian limit law 
for the number of irreducible factors in a random polynomial of large degree. This and similar 
developments lead to a complete analysis of some of the basic algorithms known for factoring 
polynomials over finite fields; see [236]. □
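
The “Prime Number Theorem” for polynomials is easy to observe numerically. Extracting $[z^n]$ in the Möbius expansion of $I(z)$ gives the exact count $I_n = \frac{1}{n} \sum_{d \mid n} \mu(d)\, p^{n/d}$, and the sketch below evaluates it and compares $n I_n$ with $p^n$. The helper names are illustrative, and the Möbius function is implemented here by plain trial division.

```python
def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:        # a squared prime factor forces mu = 0
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

def irreducible_count(n, p):
    """Number I_n of monic irreducible polynomials of degree n over F_p."""
    total = sum(mobius(d) * p ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n             # the division is exact

if __name__ == "__main__":
    print([irreducible_count(n, 2) for n in range(1, 9)])
    for n in (5, 10, 20):
        print(n, n * irreducible_count(n, 2) / 2 ** n)
```

The ratio $n I_n / p^n$ is close to 1 already for moderate $n$, since the correction terms are of order $p^{-n/2}$.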


▷ VII.2. The divisor function for polynomials. Let $\delta(\varpi)$ for $\varpi \in \mathcal{P}$ be the total number of
monic polynomials (not necessarily irreducible) dividing $\varpi$: if $\varpi = \iota_1^{e_1} \cdots \iota_k^{e_k}$, where the $\iota_j$
are distinct irreducibles, then $\delta(\varpi) = (e_1 + 1) \cdots (e_k + 1)$. One has

       $\mathbb{E}_{\mathcal{P}_n}(\delta) = \frac{[z^n] \prod_{j \ge 1} (1 + 2z^j + 3z^{2j} + \cdots)^{I_j}}{[z^n] \prod_{j \ge 1} (1 + z^j + z^{2j} + \cdots)^{I_j}} = \frac{[z^n] P(z)^2}{[z^n] P(z)}$,

so that the mean value of $\delta$ over $\mathcal{P}_n$ is exactly $(n + 1)$. This evaluation is relevant to polynomial
factorization over $\mathbb{Z}$ since it gives an upper bound on the number of irreducible factor
combinations that need to be considered in order to lift a factorization from $\mathbb{F}_p(X)$ to $\mathbb{Z}(X)$;
see [379, 599]. ◁


▷ VII.3. The cost of finding irreducible polynomials. Assume that it takes expected time $t(n)$ to
test a random polynomial of degree $n$ for irreducibility. Then it takes expected time $\sim n\,t(n)$ to
find a random irreducible polynomial of degree $n$: simply draw a polynomial at random and test
it for irreducibility. (Testing for irreducibility can itself be achieved by developing a polynomial
factorization algorithm which is stopped as soon as a non-trivial factor is found. See works by
Panario et al. for detailed analyses of this strategy [468, 469].) ◁


Profiles of exp-log structures. Under the exp—log conditions, it is also possible 
to analyse the profile of structures, that is, the number of components of size r for 
each fixed r. The Poisson distribution (Appendix C.4: Special distributions, p. 774) 
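of parameter $\nu$ is the law of a discrete random variable $Y$ such that

       $\mathbb{E}(u^Y) = e^{-\nu(1-u)}, \qquad \mathbb{P}(Y = k) = e^{-\nu} \frac{\nu^k}{k!}$.

A variable $Y$ is said to be negative binomial of parameter $(m, \alpha)$ if its probability
generating function and its individual probabilities satisfy:

       $\mathbb{E}(u^Y) = \left( \frac{1 - \alpha}{1 - \alpha u} \right)^m, \qquad \mathbb{P}(Y = k) = \binom{m + k - 1}{k} \alpha^k (1 - \alpha)^m$.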


(The quantity $\mathbb{P}(Y = k)$ is the probability that the $m$th success in a sequence of independent
trials with individual success probability $1 - \alpha$ occurs at time $m + k$; see [206,
p. 165] and Appendix C.4: Special distributions, p. 774.)

Proposition VII.1 (Profiles of exp-log structures). Assume the conditions of Theorem
VII.1 and let $X^{(r)}$ be the number of $\mathcal{G}$-components of size $r$ in an $\mathcal{F}$-object. In
the labelled case, $X^{(r)}$ admits a limit distribution of the Poisson type: for any fixed $k$,

(17)   $\lim_{n \to \infty} \mathbb{P}_{\mathcal{F}_n}(X^{(r)} = k) = e^{-\nu} \frac{\nu^k}{k!}, \qquad \nu = g_r \rho^r, \quad g_r \equiv [z^r]G(z)$.

In the unlabelled case, $X^{(r)}$ admits a limit distribution of the negative-binomial type:
for any fixed $k$,

(18)   $\lim_{n \to \infty} \mathbb{P}_{\mathcal{F}_n}(X^{(r)} = k) = \binom{G_r + k - 1}{k} \alpha^k (1 - \alpha)^{G_r}, \qquad \alpha = \rho^r, \quad G_r \equiv [z^r]G(z)$.


Proof. In the labelled case, the BGF of $\mathcal{F}$ with $u$ marking the number $X^{(r)}$ of $r$-components
is

       $F(z, u) = \exp\left( (u - 1) g_r z^r \right) F(z)$.

Extracting the coefficient of $u^k$ leads to

       $\phi_k(z) := [u^k] F(z, u) = \exp(-g_r z^r) \frac{(g_r z^r)^k}{k!} F(z)$.

The singularity type of $\phi_k(z)$ is that of $F(z)$ since the prefactor (an exponential multiplied
by a polynomial) is entire, so that singularity analysis applies directly. As a
consequence, one finds

       $[z^n]\phi_k(z) \sim \exp(-g_r \rho^r) \frac{(g_r \rho^r)^k}{k!} \cdot \left( [z^n]F(z) \right)$,

which provides the distribution of $X^{(r)}$ under the form stated in (17).
In the unlabelled case, the starting BGF equation is

       $F(z, u) = \left( \frac{1 - z^r}{1 - u z^r} \right)^{G_r} F(z)$,

and the analytic reasoning is similar to the labelled case. ∎


Proposition VII.1 will be revisited in Example IX.23, p. 675, when we examine 
continuity theorems for probability generating functions. Its unlabelled version covers 
in particular polynomials over finite fields; see [236, 372] for related results. 
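
For permutations, where $F(z) = 1/(1 - z)$, $g_r = 1/r$ and $\rho = 1$, the labelled case of Proposition VII.1 can be tested against exact values. Extracting $[z^n]$ in $\exp(-z^r/r)\,\frac{(z^r/r)^k}{k!}\,\frac{1}{1-z}$ gives $\mathbb{P}(X^{(r)} = k) = \frac{1}{r^k k!} \sum_{j=0}^{\lfloor n/r \rfloor - k} \frac{(-1)^j}{r^j j!}$, which the following sketch compares with the Poisson law of parameter $\nu = 1/r$. The function names are illustrative.

```python
import math
from fractions import Fraction

def prob_r_cycles(n, r, k):
    """Exact probability that a random permutation of size n has exactly k cycles of length r."""
    if r * k > n:
        return Fraction(0)
    tail = sum(Fraction((-1) ** j, r ** j * math.factorial(j))
               for j in range(n // r - k + 1))
    return tail / (r ** k * math.factorial(k))

def poisson(nu, k):
    """Poisson probability e^(-nu) nu^k / k!."""
    return math.exp(-nu) * nu ** k / math.factorial(k)

if __name__ == "__main__":
    for k in range(4):
        print(k, float(prob_r_cycles(30, 2, k)), poisson(0.5, k))
```

Convergence is very fast here because the prefactor is entire and the exponential series is truncated far out.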


▷ VII.4. Mean profiles. The mean value of $X^{(r)}$ satisfies

       $\mathbb{E}_{\mathcal{F}_n}(X^{(r)}) \sim g_r \rho^r, \qquad \mathbb{E}_{\mathcal{F}_n}(X^{(r)}) \sim G_r \frac{\rho^r}{1 - \rho^r}$,

in the labelled and unlabelled (multiset) case, respectively. In particular: the mean number of
roots of a random polynomial over $\mathbb{F}_p$ that lie in the base field $\mathbb{F}_p$ is asymptotic to $\frac{p}{p-1}$. Also:
the probability that a polynomial has no root in the base field is asymptotic to $(1 - 1/p)^p$. (For
random polynomials with real coefficients, a famous result of Kac (1943) asserts that the mean
number of real roots is $\sim \frac{2}{\pi} \log n$; see [185].) ◁
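
The no-root probability $(1 - 1/p)^p$ can be observed by brute force on small cases. The sketch below enumerates all monic polynomials of degree $n$ over $\mathbb{F}_p$, evaluates each one at every field element by Horner's rule, and returns the exact fraction having no root. It assumes that $p$ is a small prime (so that arithmetic modulo $p$ is field arithmetic) and that $p^n$ is small enough to enumerate; the function name is illustrative.

```python
from fractions import Fraction
from itertools import product

def no_root_fraction(p, n):
    """Fraction of monic degree-n polynomials over F_p (p prime) with no root in F_p."""
    no_root = 0
    for lower in product(range(p), repeat=n):     # coefficients a_0 .. a_{n-1}
        coeffs = lower + (1,)                     # leading coefficient a_n = 1

        def value(x):
            acc = 0
            for a in reversed(coeffs):            # Horner's rule modulo p
                acc = (acc * x + a) % p
            return acc

        if all(value(x) != 0 for x in range(p)):
            no_root += 1
    return Fraction(no_root, p ** n)

if __name__ == "__main__":
    print(no_root_fraction(2, 8), no_root_fraction(3, 4))
```

For $p = 2$ the fraction is 1/4, the value quoted in the caption of Figure VII.3; as soon as $n \ge p$ the fraction coincides exactly with $(1 - 1/p)^p$, since the values taken at the $p$ field elements are then uniformly distributed.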


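▷ VII.5. Profiles of powersets. In the case of unlabelled powersets $\mathcal{F} = \mathrm{PSET}(\mathcal{G})$ (no repetitions
of elements allowed), the distribution of $X^{(r)}$ satisfies

       $\lim_{n \to \infty} \mathbb{P}_{\mathcal{F}_n}(X^{(r)} = k) = \binom{G_r}{k} \alpha^k (1 - \alpha)^{G_r - k}, \qquad \alpha = \frac{\rho^r}{1 + \rho^r}$;

i.e., the limit is a binomial law of parameters $(G_r, \rho^r/(1 + \rho^r))$. ◁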


VII.3. Simple varieties of trees and inverse functions


A unifying theme in this chapter is the enumeration of rooted trees determined
by restrictions on the collection of allowed node degrees (Sections I.5, p. 64 and II.5,
p. 125). Some set $\Omega \subseteq \mathbb{Z}_{\ge 0}$ containing 0 (for leaves) and at least another number
$d \ge 2$ (to avoid trivialities) is fixed; in the trees considered, all outdegrees of
nodes are constrained to lie in $\Omega$. Corresponding to the four combinations, unlabelled/labelled
and plane/non-plane, there are four types of functional equations summarized
by Figure VII.4. In three of the four cases, namely,


unlabelled plane, labelled plane, and labelled non-plane, 


the generating function (OGF for unlabelled, EGF for labelled) satisfies an equation 
of the form 


(19)   $y(z) = z\phi(y(z))$.


In accordance with earlier conventions (p. 194), we name simple variety of trees any family of trees whose GF satisfies an equation of the form (19). (The functional equation satisfied by the OGF of a degree-restricted variety of unlabelled non-plane trees furthermore involves a Pólya operator $\Phi$, which implies the presence of terms of the form $y(z^2), y(z^3), \ldots$: such cases are discussed below in Section VII. 5.)

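The relation $y = z\phi(y)$ has already been examined in Section VI.7, p. 402, from the point of view of singularity analysis. For convenience, we encapsulate into a definition the conditions of the main theorem of that section, Theorem VI.6, p. 404.

$$\begin{array}{l|l|l}
 & \text{plane} & \text{non-plane} \\ \hline
\text{Unlabelled (OGF)} & \mathcal{V} = \mathcal{Z}\times\mathrm{SEQ}_\Omega(\mathcal{V}) & \mathcal{V} = \mathcal{Z}\times\mathrm{MSET}_\Omega(\mathcal{V}) \\
 & V(z) = z\phi(V(z)) & V(z) = z\Phi(V(z)) \\
 & \phi(u) := \sum_{\omega\in\Omega}u^\omega & (\Phi \text{ a Pólya operator}) \\ \hline
\text{Labelled (EGF)} & \mathcal{V} = \mathcal{Z}\star\mathrm{SEQ}_\Omega(\mathcal{V}) & \mathcal{V} = \mathcal{Z}\star\mathrm{SET}_\Omega(\mathcal{V}) \\
 & \widehat{V}(z) = z\phi(\widehat{V}(z)) & \widehat{V}(z) = z\phi(\widehat{V}(z)) \\
 & \phi(u) := \sum_{\omega\in\Omega}u^\omega & \phi(u) := \sum_{\omega\in\Omega}\frac{u^\omega}{\omega!}
\end{array}$$

Figure VII.4. Functional equations satisfied by generating functions (OGF $V(z)$ or EGF $\widehat{V}(z)$) of degree-restricted families of trees.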


Definition VII.3. Let $y(z)$ be a function analytic at 0. It is said to belong to the smooth inverse-function schema if there exists a function $\phi(u)$ analytic at 0, such that, in a neighbourhood of 0, one has

$$y(z) = z\phi(y(z)),$$

and $\phi(u)$ satisfies the following conditions.

(H1) The function $\phi(u)$ is such that

$$\phi(0) \ne 0, \qquad [u^n]\phi(u) \ge 0, \qquad \phi(u) \not\equiv \phi_0 + \phi_1 u. \tag{20}$$

(H2) Within the open disc of convergence of $\phi$ at 0, $|z| < R$, there exists a (necessarily unique) positive solution to the characteristic equation:

$$\exists\,\tau,\ 0 < \tau < R, \qquad \phi(\tau) - \tau\phi'(\tau) = 0. \tag{21}$$

A class $\mathcal{Y}$ whose generating function $y(z)$ (either ordinary or exponential) satisfies these conditions is also said to belong to the smooth inverse-function schema.

The schema is said to be aperiodic if $\phi(u)$ is an aperiodic function of $u$ (Definition IV.5, p. 266).


VII. 3.1. Asymptotic counting. As we saw on general grounds in Chapters IV and VI, inversion fails to be analytic when the first derivative of the function to be inverted vanishes. The heart of the matter is that, at the point of failure $y = \tau$, corresponding to $z = \tau/\phi(\tau)$ (the radius of convergence of $y(z)$ at 0), the dependency $y \mapsto z$ becomes quadratic, so that its inverse $z \mapsto y$ gives rise to a square-root singularity (hence the characteristic equation). From here, the typical $n^{-3/2}$ term in coefficient asymptotics results (Theorem VI.6, p. 404). In view of our needs in this chapter, we rephrase Theorem VI.6 as follows.


Theorem VII.2. Let $y(z)$ belong to the smooth inverse-function schema in the aperiodic case. Then, with $\tau$ the positive root of the characteristic equation and $\rho = \tau/\phi(\tau)$, one has

$$[z^n]\,y(z) = \sqrt{\frac{\phi(\tau)}{2\phi''(\tau)}}\;\frac{\rho^{-n}}{\sqrt{\pi n^3}}\left[1 + O\!\left(\frac{1}{n}\right)\right].$$


As we also know from Theorem VI.6 (p. 404), a complete (and locally convergent) expansion of $y(z)$ in powers of $\sqrt{1 - z/\rho}$ exists, starting with

$$y(z) = \tau - \gamma\sqrt{1 - z/\rho} + O(1 - z/\rho), \qquad \gamma := \sqrt{\frac{2\phi(\tau)}{\phi''(\tau)}}, \tag{22}$$

which implies a complete asymptotic expansion for $y_n = [z^n]y(z)$ in odd powers of $1/\sqrt{n}$. (The statement extends to the periodic case, with the necessary condition that $n \equiv 1 \bmod p$, when $\phi$ has period $p$.)
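
As a quick consistency check (a worked instance added for illustration, not part of the original derivation), Theorem VII.2 can be specialized to Cayley trees, where the exact coefficients $y_n = n^{n-1}/n!$ are known and Stirling's formula provides an independent estimate. The choice $\phi(u) = e^u$ is the only assumption.

```latex
% Cayley trees: \phi(u) = e^{u} (test case chosen for this check).
\begin{align*}
\phi(\tau) - \tau\phi'(\tau) = e^{\tau}(1-\tau) = 0
  &\;\Longrightarrow\; \tau = 1, \qquad \rho = \frac{\tau}{\phi(\tau)} = e^{-1},\\
[z^n]\,y(z) &\sim \sqrt{\frac{\phi(\tau)}{2\phi''(\tau)}}\;\frac{\rho^{-n}}{\sqrt{\pi n^3}}
  = \sqrt{\frac{e}{2e}}\;\frac{e^{n}}{\sqrt{\pi n^3}}
  = \frac{e^{n}}{\sqrt{2\pi n^3}},\\
\frac{n^{n-1}}{n!} &\sim \frac{n^{n-1}}{n^{n}e^{-n}\sqrt{2\pi n}}
  = \frac{e^{n}}{\sqrt{2\pi n^3}} \qquad \text{(Stirling)}.
\end{align*}
```

Both routes give the same main term, as expected.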

We have seen already that this framework covers binary, unary–binary, general Catalan, as well as Cayley trees (Figure VI.10, p. 406). Here is another typical application.


Example VII.5. Mobiles. A (labelled) mobile, as defined by Bergeron, Labelle, and Leroux [50, p. 240], is a (labelled) tree in which subtrees dangling from the root are taken up to cyclic shift. The numbers of mobiles of sizes 1 to 4 are

$$1, \qquad 2, \qquad 3! + 3 = 9, \qquad 4! + 4\times 2 + 4\times 3 + 4\times 3\times 2 = 68.$$

(Think of Alexander Calder’s creations.) The specification and EGF equation are

$$\mathcal{M} = \mathcal{Z}\star(1 + \mathrm{CYC}\,\mathcal{M}) \quad\Longrightarrow\quad M(z) = z\left(1 + \log\frac{1}{1 - M(z)}\right).$$

(By definition, cycles have at least one component, so that the neutral structure must be added to allow for leaf creation.) The EGF starts as $M(z) = z + 2\frac{z^2}{2!} + 9\frac{z^3}{3!} + 68\frac{z^4}{4!} + 730\frac{z^5}{5!} + \cdots$, whose coefficients constitute EIS A038037.

The verification of the conditions of the theorem is immediate. We have $\phi(u) = 1 + \log(1-u)^{-1}$, whose radius of convergence is 1. The characteristic equation reads

$$1 + \log\frac{1}{1-\tau} - \frac{\tau}{1-\tau} = 0,$$

which has a unique positive root at $\tau \doteq 0.68215$. (In fact, one has $\tau = 1 - 1/T(e^{-2})$, where $T$ denotes the branch of the Cayley tree function, solving $T = ze^{T}$, that takes values larger than 1.) The radius of convergence is $\rho \equiv 1/\phi'(\tau) = 1 - \tau$. The asymptotic formula for the number of mobiles then results:

$$\frac{1}{n!}M_n \sim C\cdot A^n n^{-3/2}, \qquad\text{where}\quad C \doteq 0.18576, \quad A \doteq 3.14619.$$

(This example is adapted from [50, p. 261], with corrections.) □

▷ VII.6. Trees with node degrees that are prime numbers. Let $\mathcal{P}$ be the class of all unlabelled plane trees such that the (out)degrees of internal nodes belong to the set of prime numbers, $\{2, 3, 5, \ldots\}$. One has $P(z) = z + z^3 + z^4 + 2z^5 + 6z^6 + 8z^7 + 29z^8 + 50z^9 + \cdots$, and $P_n \sim CA^n n^{-3/2}$, with $A \doteq 2.79256\,84676$. The asymptotic form “forgets” many details of the distribution of primes, so that it can be obtained to great accuracy. (Compare with Example V.2, p. 297 and Note VII.24, p. 480.) ◁
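
The numerical claims made for mobiles (Example VII.5) and for trees with prime node degrees (Note VII.6) can be checked mechanically. The following Python sketch is one possible way to do so: it solves $y = z\phi(y)$ by fixed-point iteration on truncated power series and locates the characteristic root by bisection. The helper names (`mul`, `solve_tree_series`, `bisect_root`) and the truncation of the prime sum at 400 are assumptions made for this illustration, not part of the text.

```python
from fractions import Fraction
from math import factorial, log, pi, sqrt


def mul(a, b, n):
    """Product of two power series truncated after z^n (lists of length n + 1)."""
    c = [0] * (n + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(n + 1 - i):
                c[i + j] += ai * b[j]
    return c


def solve_tree_series(weight, n_max):
    """Coefficients y_0..y_n_max of y = z * (1 + sum_{k>=1} weight(k) * y^k),
    by fixed-point iteration (each pass fixes at least one more coefficient)."""
    y = [Fraction(0)] * (n_max + 1)
    for _ in range(n_max):
        power = [Fraction(1)] + [Fraction(0)] * n_max   # y^0
        total = list(power)                              # the leading "1"
        for k in range(1, n_max + 1):
            power = mul(power, y, n_max)                 # y^k
            total = [t + weight(k) * p for t, p in zip(total, power)]
        y = [Fraction(0)] + total[:n_max]                # multiply by z
    return y


def is_prime(k):
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))


def bisect_root(f, lo, hi, steps=200):
    """Root of a decreasing function f on [lo, hi] with f(lo) > 0 > f(hi)."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) > 0 else (lo, mid)
    return (lo + hi) / 2


# Mobiles: phi(u) = 1 + log(1/(1-u)) = 1 + sum_k u^k / k; M_n = n! [z^n] M(z).
egf = solve_tree_series(lambda k: Fraction(1, k), 5)
mobiles = [int(egf[n] * factorial(n)) for n in range(1, 6)]
tau = bisect_root(lambda t: 1 - log(1 - t) - t / (1 - t), 0.01, 0.99)
A_mobile = 1 / (1 - tau)                      # A = 1/rho with rho = 1 - tau
C_mobile = sqrt(tau * (1 - tau) / (2 * pi))   # sqrt(phi(tau) / (2 pi phi''(tau)))

# Prime-degree trees: phi(u) = 1 + sum over primes q of u^q (truncated at q <= 400).
ogf = solve_tree_series(lambda k: int(is_prime(k)), 9)
prime_trees = [int(c) for c in ogf[1:]]
PRIMES = [q for q in range(2, 401) if is_prime(q)]
tau_p = bisect_root(lambda t: 1 + sum((1 - q) * t ** q for q in PRIMES), 0.1, 0.9)
A_prime = sum(q * tau_p ** (q - 1) for q in PRIMES)   # A = 1/rho = phi'(tau)

print(mobiles, tau, A_mobile, C_mobile)
print(prime_trees, A_prime)
```

Exact rational arithmetic is used for the series so that the first coefficients are obtained without rounding; only the characteristic roots and the constants derived from them are floating-point approximations.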


VII. 3.2. Basic tree parameters. Throughout this subsection, we consider a simple variety of trees $\mathcal{V}$, whose generating function (OGF or EGF, as the case may be) will be denoted by $y(z)$, satisfying the inverse relation $y = z\phi(y)$. In order to place all cases under a single umbrella, we shall write $y_n = [z^n]y(z)$, so that the number of trees of size $n$ is either $V_n = y_n$ (unlabelled case) or $V_n = n!\,y_n$ (labelled case). We postulate throughout that $y(z)$ belongs to the smooth inverse-function schema and is aperiodic.

As already seen on several occasions in Chapter III (Section III. 5, p. 181), additive parameters lead to generating functions that are expressible in terms of the basic tree generating function $y(z)$. Now that singularity analysis is available, such generating functions can be exploited systematically, with a wealth of asymptotic estimates relative to trees of large sizes coming within easy reach. The universality of the square-root singularity among varieties of trees that satisfy the smoothness assumption of Definition VII.3 then implies universal behaviour for many tree parameters, which we now list.


(i) Node degrees. The degree of the root in a large random tree is $O(1)$ on average and with high probability, and its asymptotic distribution can be generally determined (Example VII.6). A similar property holds for the degree of a random node in a random tree (Example VII.8).

(ii) Level profiles can also be determined. The quantity of interest is the mean number of nodes in the $k$th layer from the root in a random tree. It is seen for instance that, near the root, a tree from a simple variety tends to grow linearly (Example VII.7), this in sharp contrast with other random tree models (for instance, increasing trees, Subsection VII. 9.2, p. 526), where the growth is exponential. This property is one of the numerous indications that random trees taken from simple varieties are skinny and far from having a well-balanced shape. A related property is the fact that path length is on average $O(n\sqrt{n})$ (Example VII.9), which means that the typical depth of a random node in a random tree is $O(\sqrt{n})$.


These basic properties are only the tip of an iceberg. Indeed, Meir and Moon, who launched the study of simple varieties of trees (the seminal paper [435] can serve as a good starting point) have worked out literally several dozen analyses of parameters of trees, using a strategy similar to the one presented here⁶. We shall have occasion, in Chapter IX, to return to probabilistic properties of simple varieties of trees satisfying the smooth inverse-function schema; we only indicate here for completeness that height is known generally to scale as $\sqrt{n}$ and is associated to a limiting theta distribution (see Proposition V.4, p. 329 for the case of Catalan trees and Subsection VII. 10.2, p. 535, for general results), with similar properties holding true for width as shown by Odlyzko–Wilf and Chassaing–Marckert–Yor [112, 463].

⁶ The main difference is that Meir and Moon appeal to the Darboux–Pólya method discussed in Section VI. 11 (p. 433) instead of singularity analysis.

$$\begin{array}{l|c|c|l}
\text{Tree} & \phi(w) & \tau,\ \rho & \text{PGF of root degree (type)} \\ \hline
\text{simple variety} & \text{--} & \text{--} & u\phi'(\tau u)/\phi'(\tau) \\ \hline
\text{binary} & (1+w)^2 & 1,\ \tfrac14 & \tfrac12 u + \tfrac12 u^2 \quad\text{(Bernoulli)} \\
\text{unary–binary} & 1+w+w^2 & 1,\ \tfrac13 & \tfrac13 u + \tfrac23 u^2 \quad\text{(Bernoulli)} \\
\text{general} & (1-w)^{-1} & \tfrac12,\ \tfrac14 & u/(2-u)^2 \quad\text{(sum of two geometric)} \\
\text{Cayley} & e^w & 1,\ e^{-1} & ue^{u-1} \quad\text{(shifted Poisson)}
\end{array}$$

Figure VII.5. The distribution of root degree in simple varieties of trees of the smooth inverse-function schema.


Example VII.6. Root degrees in simple varieties. Here is an immediate application of singularity analysis, one that exemplifies the synthetic type of reasoning that goes along with the method. Take for notational simplicity a simple family $\mathcal{V}$ that is unlabelled, with OGF $V(z) \equiv y(z)$. Let $\mathcal{V}^{[k]}$ be the subset of $\mathcal{V}$ composed of all trees whose root has degree equal to $k$. Since a tree in $\mathcal{V}^{[k]}$ is formed by appending a root to a collection of $k$ trees, one has

$$V^{[k]}(z) = \phi_k\,z\,y(z)^k, \qquad \phi_k := [w^k]\phi(w).$$

For any fixed $k$, a singular expansion results from raising both members of (22) to the $k$th power; in particular,

$$V^{[k]}(z) = \phi_k z\left[\tau^k - k\gamma\tau^{k-1}\sqrt{1 - \frac{z}{\rho}} + O\!\left(1 - \frac{z}{\rho}\right)\right]. \tag{23}$$

This is to be compared with the basic estimate (22): the ratio $V^{[k]}_n/V_n$ is then asymptotic to the ratio of the coefficients of $\sqrt{1 - z/\rho}$ in the corresponding generating functions, $V^{[k]}(z)$ and $V(z) \equiv y(z)$. Thus, for any fixed $k$, we have found that

$$\frac{V^{[k]}_n}{V_n} = \rho k\phi_k\tau^{k-1} + O(n^{-1/2}). \tag{24}$$

(The error term can be strengthened to $O(n^{-1})$ by pushing the expansion one step further.) The ratio $V^{[k]}_n/V_n$ is the probability that the root of a random tree of size $n$ has degree $k$. Since $\rho = 1/\phi'(\tau)$, one can rephrase (24) as follows: In a smooth simple variety of trees, the random variable $\Delta$ representing root-degree admits a discrete limit distribution given by

$$\lim_{n\to\infty}\mathbb{P}_{\mathcal{V}_n}(\Delta = k) = \frac{k\phi_k\tau^{k-1}}{\phi'(\tau)}. \tag{25}$$

(By general principles expounded in Chapter IX, convergence is uniform.) Accordingly, the probability generating function (PGF) of the limit law admits the simple expression

$$\lim_{n\to\infty}\mathbb{E}_{\mathcal{V}_n}(u^{\Delta}) = \frac{u\phi'(\tau u)}{\phi'(\tau)}.$$

The distribution is thus characterized by the fact that its PGF is a scaled version of the derivative of the basic tree constructor $\phi(w)$. Figure VII.5 summarizes this property together with its specialization to our four pilot examples. □
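
The limit law of root degree can be compared with exact finite-size probabilities. The Python sketch below does this for general Catalan trees ($\phi(w) = (1-w)^{-1}$, $\tau = 1/2$), for which the limit probability $k\phi_k\tau^{k-1}/\phi'(\tau)$ equals $k/2^{k+1}$. It relies on the Lagrange inversion formula $[z^m]y(z)^k = \frac{k}{m}[w^{m-k}]\phi(w)^m$, which is quoted here as a known fact rather than derived; the function name is illustrative.

```python
from fractions import Fraction
from math import comb


def root_degree_probability(n, k):
    """Exact probability that the root of a uniform random general Catalan
    (plane) tree with n nodes has degree k, for n >= 2 and 1 <= k <= n - 1."""
    m = n - 1                                              # nodes below the root
    forests = Fraction(k, m) * comb(2 * m - k - 1, m - k)  # [z^m] y(z)^k
    trees = Fraction(1, n) * comb(2 * n - 2, n - 1)        # [z^n] y(z)
    return forests / trees


n = 2000
for k in range(1, 6):
    exact = float(root_degree_probability(n, k))
    limit = k / 2 ** (k + 1)          # k * phi_k * tau^(k-1) / phi'(tau)
    print(k, round(exact, 5), limit)
```

The exact values approach the limit at rate $O(1/n)$ for each fixed $k$, in line with the strengthened error term mentioned after (24).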


Additive functionals. Singularity analysis applies to many additive parameters of trees. Consider three tree parameters, $\xi$, $\eta$, $\sigma$ satisfying the basic relation,

$$\xi(t) = \eta(t) + \sum_{j=1}^{\deg(t)}\sigma(t_j), \tag{26}$$

which can be taken to define $\xi(t)$ in terms of the simpler parameter $\eta(t)$ (a “toll”, cf. Subsection VI. 10.3, p. 427) and the sum of values of $\sigma$ over the root subtrees of $t$ (with $\deg(t)$ the degree of the root and $t_j$ the $j$th root-subtree of $t$). In the case of a recursive parameter, $\xi \equiv \sigma$, unwinding the recursion shows that $\xi(t) := \sum_{s\preceq t}\eta(s)$, where the sum is extended to all subtrees $s$ of $t$. As we are interested in average-case analysis, we introduce the cumulative GFs,

$$\Xi(z) = \sum_{t}\xi(t)z^{|t|}, \qquad H(z) = \sum_{t}\eta(t)z^{|t|}, \qquad \Sigma(z) = \sum_{t}\sigma(t)z^{|t|}, \tag{27}$$

assuming again an unlabelled variety of trees for simplicity.

We first state a simple algebraic result which formalizes several of the calculations of Section III. 5, p. 181, dedicated to recursive tree parameters.


Lemma VII.1 (Iteration lemma for trees). For tree parameters from a simple variety with GF $y(z)$ that satisfy the additive relation (26), the cumulative generating functions (27) are related by

$$\Xi(z) = H(z) + z\phi'(y(z))\,\Sigma(z). \tag{28}$$

In particular, if $\xi$ is defined recursively in terms of $\eta$, that is, $\sigma \equiv \xi$, one has

$$\Xi(z) = \frac{H(z)}{1 - z\phi'(y(z))} = \frac{zy'(z)}{y(z)}\,H(z). \tag{29}$$

Proof. We have

$$\Xi(z) = H(z) + \widetilde{\Xi}(z), \qquad\text{where}\quad \widetilde{\Xi}(z) := \sum_{t\in\mathcal{V}}z^{|t|}\sum_{j=1}^{\deg(t)}\sigma(t_j).$$

Splitting the expression of $\widetilde{\Xi}(z)$ according to the values $r$ of root degree, we find

$$\begin{aligned}
\widetilde{\Xi}(z) &= \sum_{r\ge 0}\phi_r\sum_{t_1,\ldots,t_r\in\mathcal{V}}z^{1+|t_1|+\cdots+|t_r|}\bigl(\sigma(t_1)+\sigma(t_2)+\cdots+\sigma(t_r)\bigr)\\
&= z\sum_{r\ge 0}\phi_r\bigl(\Sigma(z)y(z)^{r-1} + y(z)\Sigma(z)y(z)^{r-2} + \cdots + y(z)^{r-1}\Sigma(z)\bigr)\\
&= z\,\Sigma(z)\cdot\sum_{r\ge 0}\bigl(r\phi_r\,y(z)^{r-1}\bigr),
\end{aligned}$$

which yields the linear relation expressing $\Xi$ in (28).

In the recursive case, the function $\Xi$ is determined by a linear equation, namely $\Xi(z) = H(z) + z\phi'(y(z))\,\Xi(z)$, which, once solved, provides the first form of (29). Differentiation of the fundamental relation $y = z\phi(y)$ yields the identity

$$y'\bigl(1 - z\phi'(y)\bigr) = \phi(y) = \frac{y}{z}, \qquad\text{i.e.,}\qquad 1 - z\phi'(y) = \frac{y}{zy'},$$

from which the second form results. □

▷ VII.7. A symbolic derivation. For a recursive parameter, we can view $\Xi(z)$ as the GF of trees with one subtree marked, to which is attached a weight of $\eta$. Then (29) can be interpreted as follows: point to an arbitrary node in a tree of $\mathcal{V}$ (the GF is $zy'(z)$), remove the tree attached to this node (a factor of $y(z)^{-1}$), and replace it by the same tree but now weighted by $\eta$ (the GF is $H(z)$). ◁

▷ VII.8. Labelled varieties. Formulae (28) and (29) hold verbatim for labelled trees (either of the plane or non-plane type), provided we interpret $y(z)$, $\Xi(z)$, $H(z)$ as EGFs: $\Xi(z) := \sum_{t\in\mathcal{V}}\xi(t)\,z^{|t|}/|t|!$, and so on. ◁


Example VII.7. Mean level profile in simple varieties. The question we address here is that of determining the mean number of nodes at level $k$ (i.e., at distance $k$ from the root) in a random tree of some large size $n$. (An explicit expression for the joint distribution of nodes at all levels has been developed in Subsection III. 6.2, p. 193, but this multivariate representation is somewhat hard to interpret asymptotically.)

Let $\xi_k(t)$ be the number of nodes at level $k$ in tree $t$. Define the generating function of cumulated values,

$$X_k(z) := \sum_{t\in\mathcal{V}}\xi_k(t)\,z^{|t|}.$$

Clearly, $X_0(z) \equiv y(z)$ since each tree has a unique root. Then, since the parameter $\xi_k$ is the sum over subtrees of parameter $\xi_{k-1}$, we are in a situation exactly covered by (28), with $\eta(t) \equiv 0$. The recurrence $X_k(z) = z\phi'(y(z))\,X_{k-1}(z)$ is then immediately solved, to the effect that

$$X_k(z) = \bigl(z\phi'(y(z))\bigr)^k\,y(z). \tag{30}$$

Making use of the (analytic) expansion of $\phi'$ at $\tau$, namely, $\phi'(y) \sim \phi'(\tau) + \phi''(\tau)(y - \tau)$ and of $\rho\phi'(\tau) = 1$, one obtains, for any fixed $k$:

$$X_k(z) \sim \left(1 - k\gamma\rho\phi''(\tau)\sqrt{1 - \frac{z}{\rho}}\right)\left(\tau - \gamma\sqrt{1 - \frac{z}{\rho}}\right) \sim \tau - \gamma\bigl(\tau\rho\phi''(\tau)k + 1\bigr)\sqrt{1 - \frac{z}{\rho}}.$$

Thus comparing the singular part of $X_k(z)$ to that of $y(z)$, we find: For fixed $k$, the mean number of nodes at level $k$ in a tree is of the asymptotic form

$$\mathbb{E}_{\mathcal{V}_n}[\xi_k] \sim Ak + 1, \qquad A := \tau\rho\phi''(\tau).$$

This result was first given by Meir and Moon [435]. The striking fact is that, although the number of nodes at level $k$ can at least double at each level, growth is only linear on average. In figurative terms, the immediate vicinity of the root starts like a “cone”, and trees of simple varieties tend to be rather skinny near their base.
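
The linear growth of the mean level profile can be observed on exact values. For general Catalan trees one has $z\phi'(y) = y^2/z$, so that (30) becomes $X_k(z) = y(z)^{2k+1}z^{-k}$, and Lagrange inversion (quoted, not derived here) gives the coefficients in closed form; the predicted limit is $Ak + 1 = 2k + 1$ since $A = \tau\rho\phi''(\tau) = \frac12\cdot\frac14\cdot 16 = 2$. The following Python sketch, with an illustrative function name, compares the two.

```python
from fractions import Fraction
from math import comb


def mean_level_size(n, k):
    """Exact mean number of nodes at level k in a uniform random general
    Catalan (plane) tree with n nodes, for 0 <= k <= n - 1."""
    # [z^n] X_k(z) = [z^(n+k)] y(z)^(2k+1), by Lagrange inversion
    cumulated = Fraction(2 * k + 1, n + k) * comb(2 * n - 2, n - k - 1)
    trees = Fraction(1, n) * comb(2 * n - 2, n - 1)       # [z^n] y(z)
    return cumulated / trees


n = 100_000
for k in range(6):
    print(k, float(mean_level_size(n, k)), 2 * k + 1)
```

As a sanity check on the formula, the level sizes of any tree add up to its size, so the means summed over all levels must equal $n$ exactly.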

When used in conjunction with saddle-point bounds (p. 246), the exact GF expression of (30) additionally provides a probabilistic upper bound on the height of trees of the form $O(n^{1/2+\delta})$ for any $\delta > 0$. Indeed restrict $z$ to the interval $(0, \rho)$ and assume that $k = n^{1/2+\delta}$. Let $\chi$ be the height parameter. First, we have

$$\mathbb{P}_{\mathcal{V}_n}(\chi \ge k) \equiv \mathbb{E}_{\mathcal{V}_n}([\![\xi_k \ge 1]\!]) \le \mathbb{E}_{\mathcal{V}_n}(\xi_k). \tag{31}$$

Next by saddle-point bounds, for any legal positive $x$ (that is, $0 < x < R_{\mathrm{conv}}(\phi)$),

$$\mathbb{E}_{\mathcal{V}_n}(\xi_k) \le \bigl(x\phi'(y(x))\bigr)^k\,y(x)\,x^{-n} \le \tau\bigl(x\phi'(y(x))\bigr)^k\,x^{-n}. \tag{32}$$

Fix now $x = \rho - n^{\delta}/n$. Local expansions then show that

$$\log\Bigl(\bigl(x\phi'(y(x))\bigr)^k\,x^{-n}\Bigr) \le -Kn^{3\delta/2} + O(n^{\delta}), \tag{33}$$

for some positive constant $K$. Thus, by (31) and (33): In a smooth simple variety of trees, the probability of height exceeding $n^{1/2+\delta}$ is exponentially small, being of the rough form $\exp(-n^{3\delta/2})$. Accordingly, the mean height is $O(n^{1/2+\delta})$ for any $\delta > 0$. The moments of height were characterized in [246]: the mean is asymptotic to $\lambda\sqrt{n}$ and the limit distribution is of the Theta type encountered in Example V.8, p. 326, in the particular case of general Catalan trees, where explicit expressions are available. (Further local limit and large deviation estimates appear in [230]; we shall return to the topic of tree height in Subsection VII. 10.1, p. 532.) Figure VII.6 displays three random trees of size $n = 500$. □

Figure VII.6. Three random 2–3 trees ($\Omega = \{0, 2, 3\}$) of size $n = 500$ have height, respectively, 48, 57, 47, in agreement with the fact that height is typically $O(\sqrt{n})$.

▷ VII.9. The variance of level profiles. The BGF of trees with $u$ marking nodes at level $k$ has an explicit expression, in accordance with the developments of Chapter III. For instance for $k = 3$, this is $z\phi(z\phi(z\phi(uy(z))))$. Double differentiation followed by singularity analysis shows that

$$\mathbb{V}_{\mathcal{V}_n}[\xi_k] \sim \frac12 A^2k^2 - \frac12 A(3 - 4A)k + \tau A - 1,$$

another result of Meir and Moon [435]. The precise analysis of the mean and variance in the interesting regime where $k$ is proportional to $\sqrt{n}$ is also given in [435], but it requires either the saddle-point method (Chapter VIII) or the adapted singularity analysis techniques of Theorem IX.16, p. 709. ◁


Example VII.8. Mean degree profile. Let $\xi(t) \equiv \xi_k(t)$ be the number of nodes of degree $k$ in a random tree of some variety $\mathcal{V}$. The analysis extends that of the root degree seen earlier. The parameter $\xi$ is an additive functional induced by the basic parameter $\eta(t) \equiv \eta_k(t)$ defined by $\eta_k(t) := [\![\deg(t) = k]\!]$. By the analysis of root degree, we have for the GF of cumulated values associated to $\eta$

$$H(z) = \phi_k\,z\,y(z)^k, \qquad \phi_k := [w^k]\phi(w),$$

so that, by the fundamental formula (29),

$$X(z) = \phi_k\,z\,y(z)^k\,\frac{zy'(z)}{y(z)} = z^2\phi_k\,y(z)^{k-1}\,y'(z).$$

The singular expansion of $zy'(z)$ can be obtained from that of $y(z)$ by differentiation (Theorem VI.8, p. 419),

$$zy'(z) = \frac{\gamma}{2}\,\frac{1}{\sqrt{1 - z/\rho}} + O(1),$$

the corresponding coefficient satisfying $[z^n](zy') = ny_n$. This gives immediately the singularity type of $X$, which is of the form of an inverse square root. Thus,

$$X(z) \sim \rho\phi_k\tau^{k-1}\bigl(zy'(z)\bigr),$$

implying ($\rho = \tau/\phi(\tau)$)

$$\frac{X_n}{ny_n} \sim \frac{\phi_k\tau^k}{\phi(\tau)}.$$

Consequently, one has:

Proposition VII.2. In a smooth simple variety of trees, the mean number of nodes of degree $k$ is asymptotic to $\lambda_k n$, where $\lambda_k := \phi_k\tau^k/\phi(\tau)$. Equivalently, the probability distribution of the degree $\Delta^\star$ of a random node in a random tree of size $n$ satisfies

$$\lim_{n\to\infty}\mathbb{P}_n(\Delta^\star = k) = \lambda_k \equiv \frac{\phi_k\tau^k}{\phi(\tau)}, \qquad\text{with PGF:}\quad \sum_k\lambda_ku^k = \frac{\phi(u\tau)}{\phi(\tau)}.$$

For the usual tree varieties this gives:

$$\begin{array}{l|c|c|l}
\text{Tree} & \phi(w) & \tau,\ \rho & \text{probability distribution (type)} \\ \hline
\text{binary} & (1+w)^2 & 1,\ \tfrac14 & \text{PGF: } \tfrac14 + \tfrac12u + \tfrac14u^2 \quad\text{(Bernoulli)} \\
\text{unary–binary} & 1+w+w^2 & 1,\ \tfrac13 & \text{PGF: } \tfrac13 + \tfrac13u + \tfrac13u^2 \quad\text{(Bernoulli)} \\
\text{general} & (1-w)^{-1} & \tfrac12,\ \tfrac14 & \text{PGF: } 1/(2-u) \quad\text{(Geometric)} \\
\text{Cayley} & e^w & 1,\ e^{-1} & \text{PGF: } e^{u-1} \quad\text{(Poisson)}
\end{array}$$

For instance, asymptotically, a general Catalan tree has on average $n/2$ leaves, $n/4$ nodes of degree 1, $n/8$ of degree 2, and so on; a Cayley tree has $\sim ne^{-1}/k!$ nodes of degree $k$; for binary (Catalan) trees, the four possible types of nodes each appear with asymptotic frequency 1/4. (These data agree with the fact that a random tree under $\mathcal{V}_n$ is distributed like a branching process tree determined by the PGF $\phi(u\tau)/\phi(\tau)$; see Subsection III. 6.2, p. 193.) □

▷ VII.10. Variances. The variance of the number of $k$-ary nodes is $\sim \nu n$, so that the distribution of the number of nodes of this type is concentrated, for each fixed $k$. The starting point is the BGF defined implicitly by

$$Y(z,u) = z\bigl(\phi(Y(z,u)) + \phi_k(u - 1)\,Y(z,u)^k\bigr),$$

upon taking a double derivative with respect to $u$, setting $u = 1$, and finally performing singularity analysis on the resulting GF. ◁

▷ VII.11. The mother of a random node. The discrepancy in distributions between the root degree and the degree of a random node deserves an explanation. Pick up a node distinct from the root at random in a tree and look at the degree of its mother. The PGF of the law is in the limit $u\phi'(u\tau)/\phi'(\tau)$. Thus the degree of the root is asymptotically the same as that of the mother of any non-root node.

More generally, let $X$ have distribution $p_k := \mathbb{P}(X = k)$. Construct a random variable $Y$ such that the probability $q_k := \mathbb{P}(Y = k)$ is proportional both to $k$ and $p_k$. Then for the associated PGFs, the relation $q(u) = up'(u)/p'(1)$ holds. The law of $Y$ is said to be the size-biased version of the law of $X$. Here, a mother is picked up with an importance proportional to its degree. In this perspective, Eve appears to be just like a random mother. ◁


Example VII.9. Path length. Path length of a tree is the sum of the distances of all nodes to the root. It is defined recursively by

$$\xi(t) = |t| - 1 + \sum_{j=1}^{\deg(t)}\xi(t_j)$$

(Example III.15, p. 184 and Subsection VI. 10.3, p. 427). Within the framework of additive functionals of trees (28), we have $\eta(t) = |t| - 1$ corresponding to the GF of cumulated values $H(z) = zy'(z) - y(z)$, and the fundamental relation (29) gives

$$X(z) = \bigl(zy'(z) - y(z)\bigr)\frac{zy'(z)}{y(z)} = \frac{z^2y'(z)^2}{y(z)} - zy'(z).$$

The type of $y'(z)$ at its singularity is $Z^{-1/2}$, where $Z := (1 - z/\rho)$. The formula for $X(z)$ involves the square of $y'$, so that the singularity of $X(z)$ is of type $Z^{-1}$, resembling a simple pole. This means that the cumulated value $X_n = [z^n]X(z)$ grows like $\rho^{-n}$, so that the mean value of $\xi$ over $\mathcal{V}_n$ has growth $n^{3/2}$. Working out the constants, we find

$$X(z) + zy'(z) = \frac{\gamma^2}{4\tau}\,\frac{1}{Z} + O(Z^{-1/2}).$$

As a consequence:

Proposition VII.3. In a random tree of size $n$ from a smooth simple variety, the expectation of path length satisfies

$$\mathbb{E}_{\mathcal{V}_n}(\xi) = \lambda\sqrt{\pi n^3} + O(n), \qquad \lambda := \sqrt{\frac{\phi(\tau)}{2\tau^2\phi''(\tau)}}. \tag{34}$$

For our classical varieties, the main terms of (34) are then:

$$\begin{array}{cccc}
\text{Binary} & \text{unary–binary} & \text{general} & \text{Cayley} \\ \hline
\sim\sqrt{\pi n^3} & \sim\tfrac12\sqrt{3\pi n^3} & \sim\tfrac12\sqrt{\pi n^3} & \sim\sqrt{\tfrac12\pi n^3}.
\end{array}$$
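
Proposition VII.3 can be tested on exact values. The Python sketch below takes general Catalan trees as the test case (an assumption of this illustration), for which $y_n$ is the Catalan number $\frac{1}{n}\binom{2n-2}{n-1}$ and $\lambda = \frac12$. It expands $X(z) = z^2y'(z)^2/y(z) - zy'(z)$ as a power series with integer coefficients and compares the resulting mean with $\frac12\sqrt{\pi n^3}$; function names are illustrative.

```python
from math import comb, pi, sqrt


def path_length_totals(n_max):
    """Return X with X[n] = total path length over all general Catalan
    (plane) trees with n nodes, 0 <= n <= n_max, computed from
    X(z) = z^2 y'(z)^2 / y(z) - z y'(z)."""
    size = n_max + 2
    y = [0] + [comb(2 * n - 2, n - 1) // n for n in range(1, size)]  # y[n] = Catalan(n-1)
    zdy = [n * y[n] for n in range(size)]                            # z y'(z)
    square = [sum(zdy[i] * zdy[n - i] for i in range(n + 1)) for n in range(size)]
    b = square[1:]        # (z y')^2 / z
    s = y[1:]             # y(z) / z, a series with constant term 1
    q = []                # q = b / s, i.e. z^2 y'^2 / y
    for n in range(n_max + 1):
        q.append(b[n] - sum(s[i] * q[n - i] for i in range(1, n + 1)))
    return [q[n] - zdy[n] for n in range(n_max + 1)]


def mean_path_length(n):
    """Exact mean path length of a uniform random plane tree with n nodes."""
    trees = comb(2 * n - 2, n - 1) // n
    return path_length_totals(n)[n] / trees


n = 400
exact = mean_path_length(n)
estimate = 0.5 * sqrt(pi * n ** 3)      # lambda * sqrt(pi n^3) with lambda = 1/2
print(exact, estimate, exact / estimate)
```

The ratio is slightly below 1 at moderate sizes, which is consistent with the $O(n)$ correction term in (34).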


Observe that the quantity $\frac{1}{n}\mathbb{E}_{\mathcal{V}_n}(\xi)$ represents the expected depth of a random node in a random tree (the model is then $[1\,.\,.\,n] \times \mathcal{V}_n$), which is thus $\sim \lambda\sqrt{\pi n}$. (This result is consistent with height of a tree being with high probability of order $O(n^{1/2})$.) □

▷ VII.12. Variance of path length. Path length can be analysed starting from the bivariate generating function given by a functional equation of the difference type (see Chapter III, p. 185), which allows for the computation of higher moments. The standard deviation is found to be asymptotic to $\Lambda_2 n^{3/2}$ for some computable constant $\Lambda_2 > 0$, so that the distribution is spread. Louchard [416] and Takács [566] have additionally worked out the asymptotic form of all moments, leading to a characterization of the limit law of path length that can be described in terms of the Airy function: see Subsection VII. 10.1, p. 532. ◁

$$\begin{array}{ll|ll}
\#\ \text{components} & \sim\tfrac12\log n & \text{Tail length } (\lambda) & \sim\sqrt{\pi n/8} \\
\#\ \text{cyclic nodes} & \sim\sqrt{\pi n/2} & \text{Cycle length } (\mu) & \sim\sqrt{\pi n/8} \\
\#\ \text{terminal nodes} & \sim ne^{-1} & \text{Tree size} & \sim n/3 \\
\#\ \text{nodes of in-degree } k & \sim ne^{-1}/k! & \text{Component size} & \sim 2n/3
\end{array}$$

Figure VII.7. Expectations of the main additive parameters of random mappings of size $n$.

▷ VII.13. Generalizations of path length. Define the subtree size index of order $\alpha \in \mathbb{R}_{\ge 0}$ to be $\xi(t) \equiv \xi_\alpha(t) := \sum_{s\preceq t}|s|^\alpha$, where the sum is extended to all the subtrees $s$ of $t$. This corresponds to a recursively defined parameter with $\eta(t) := |t|^\alpha$. The results of Section VI. 10 relative to Hadamard products and polylogarithms make it possible to analyse the singularities of $H(z)$ and $X(z)$. It is found that there are three different regimes

$$\begin{array}{ccc}
\alpha > \tfrac12 & \alpha = \tfrac12 & \alpha < \tfrac12 \\ \hline
\mathbb{E}_{\mathcal{V}_n}(\xi) \sim K_\alpha n^{\alpha+1/2} & \mathbb{E}_{\mathcal{V}_n}(\xi) \sim K_{1/2}\,n\log n & \mathbb{E}_{\mathcal{V}_n}(\xi) \sim K_\alpha n
\end{array}$$

where each $K_\alpha$ is a computable constant. (This extends the results of Subsection VI. 10.3, p. 427 to all simple varieties of trees that are smooth.) ◁


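VII. 3.3. Mappings. The basic construction of mappings (Chapter II, p. 129),

$$\left\{\begin{array}{l}\mathcal{F} = \mathrm{SET}(\mathcal{K}) \\ \mathcal{K} = \mathrm{CYC}(\mathcal{T}) \\ \mathcal{T} = \mathcal{Z}\star\mathrm{SET}(\mathcal{T})\end{array}\right. \quad\Longrightarrow\quad \left\{\begin{array}{l}F = \exp(K) \\ K = \log\dfrac{1}{1-T} \\ T = ze^{T}\end{array}\right. \tag{35}$$

builds mappings from Cayley trees, which constitute a smooth simple variety. The construction lends itself to a number of multivariate extensions. For instance, we already know from Example VII.3, p. 449, that the number of components is asymptotic to $\frac12\log n$, both on average and in probability.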

Take next the parameter $\chi$ equal to the number of cyclic points, which gives rise to the BGF

$$F(z,u) = \exp\left(\log\frac{1}{1 - uT}\right) = (1 - uT)^{-1}.$$

The mean number of cyclic points, for a random mapping of size $n$, is accordingly

$$\mu_n \equiv \mathbb{E}_{\mathcal{F}_n}(\chi) = \frac{n!}{n^n}[z^n]\left.\frac{\partial}{\partial u}F(z,u)\right|_{u=1} = \frac{n!}{n^n}[z^n]\frac{T}{(1-T)^2}. \tag{36}$$

Singularity analysis is immediate, since

$$\frac{T}{(1-T)^2} \mathop{\sim}_{z\to e^{-1}} \frac12\,\frac{1}{1 - ez} \quad\Longrightarrow\quad [z^n]\frac{T}{(1-T)^2} \mathop{\sim}_{n\to\infty} \frac12 e^n.$$

Thus: The mean number of cyclic points in a random mapping of size $n$ is asymptotic to $\sqrt{\pi n/2}$.

Many parameters can be similarly analysed in a systematic manner, thanks to generating functions, as shown in the survey [247]: see Figure VII.7 for a summary of results whose proofs we leave as exercises to the reader. The left-most table describes global parameters of mappings; the right-most table is relative to properties of a random point in a random $n$-mapping: $\lambda$ is the distance to its cycle of a random point, $\mu$ the length of the cycle to which the point leads, tree size and component size are, respectively, the size of the largest tree containing the point and the size of its (weakly) connected component. In particular, a random mapping of size $n$ has relatively few components, some of which are expected to be of a large size.

Figure VII.8. Two views of a random mapping of size $n = 100$. The random mapping has three connected components, with cycles of respective size 2, 4, 4; it is made of fairly skinny trees, has a giant component of size 75, and its diameter equals 14.
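
The mean number of cyclic points given by (36) can also be evaluated exactly for a given size. The Python sketch below uses the identity $\frac{n!}{n^n}[z^n]\frac{T}{(1-T)^2} = \sum_{k=1}^{n}\frac{n!}{(n-k)!\,n^k}$, a consequence of Lagrange inversion that is quoted here without proof, and cross-checks it against brute-force enumeration of all $n^n$ mappings for tiny $n$. Function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import pi, sqrt


def mean_cyclic_points(n):
    """Exact mean number of cyclic points of a uniform random mapping of an
    n-element set, as the sum over k of n (n-1) ... (n-k+1) / n^k."""
    total, term = Fraction(0), Fraction(1)
    for k in range(1, n + 1):
        term *= Fraction(n - k + 1, n)
        total += term
    return total


def mean_cyclic_points_bruteforce(n):
    """Same mean, by enumerating all n^n mappings (only sensible for tiny n)."""
    count = 0
    for f in product(range(n), repeat=n):
        image = set(range(n))
        while True:                       # iterate f until the image stabilizes
            smaller = {f[x] for x in image}
            if smaller == image:
                break
            image = smaller
        count += len(image)               # the stable image is the set of cyclic points
    return Fraction(count, n ** n)


print(float(mean_cyclic_points(100)), sqrt(pi * 100 / 2))
```

For $n = 100$ the exact mean is a little above 12.2, somewhat below the asymptotic value $\sqrt{\pi n/2} \doteq 12.53$.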

The estimates of Figure VII.7 are in fair agreement with what is observed on the single sample of size $n = 100$ of Figure VII.8: this particular mapping has 3 components (the average is about 2.97), 10 cyclic points (the average, as calculated in (36), is about 12.20), but a fairly large diameter (the maximum value of $\lambda + \mu$, taken over all nodes) equal to 14, and a giant component of size 75. The proportion of nodes of degree 0, 1, 2, 3, 4 turns out to be, respectively, 39%, 33%, 21%, 7%, 1%, to be compared against the asymptotic values given by a Poisson law of rate 1 (analogous to the degree profile of Cayley trees found in Example VII.8); namely 36.7%, 36.7%, 18.3%, 6.1%, 1.5%.

▷ VII.14. Extremal statistics on mappings. Let $\lambda^{\max}$, $\mu^{\max}$, and $\rho^{\max}$ be the maximum values of $\lambda$, $\mu$, and $\rho$, taken over all the possible starting points, where $\rho = \lambda + \mu$. Then, the expectations satisfy [247]

$$\mathbb{E}_{\mathcal{F}_n}(\lambda^{\max}) \sim \kappa_1\sqrt{n}, \qquad \mathbb{E}_{\mathcal{F}_n}(\mu^{\max}) \sim \kappa_2\sqrt{n}, \qquad \mathbb{E}_{\mathcal{F}_n}(\rho^{\max}) \sim \kappa_3\sqrt{n},$$

where $\kappa_1 = \sqrt{2\pi}\log 2 \doteq 1.73746$, $\kappa_2 \doteq 0.78248$ and $\kappa_3 \doteq 2.4149$. (For the estimate relative to $\rho^{\max}$, see also [12].)

The largest tree and the largest components have expectations asymptotic, respectively, to $\delta_1 n$ and $\delta_2 n$, where $\delta_1 \doteq 0.48$ and $\delta_2 \doteq 0.7582$. ◁

The properties outlined above for the class of all mappings also prove to be universal for a wide variety of mappings defined by degree restrictions of various sorts: we outline the basis of the corresponding theory in Example VII.10, then show some surprising applications in Example VII.11.


Example VII.10. Simple varieties of mappings. Let $\Omega$ be a subset of the integers containing 0 and at least another integer greater than 1. Consider mappings $\varphi \in \mathcal{F}$ such that the number of preimages of any point is constrained to lie in $\Omega$. Such special mappings may serve to model the behaviour of special classes of functions under iteration, and are accordingly of interest in various areas of computational number theory and cryptography. For instance, the quadratic functions $\varphi(x) = x^2 + a$ over $\mathbb{F}_p$ have the property that each element $y$ has either zero, one, or two preimages (depending on whether $y - a$ is a quadratic non-residue, 0, or a quadratic residue).

The basic construction of mappings needs to be amended. Start with the family of trees $\mathcal{T}$ that are the simple variety corresponding to $\Omega$:

$$T = z\phi(T), \qquad \phi(w) := \sum_{\omega\in\Omega}\frac{w^\omega}{\omega!}. \tag{37}$$

At any vertex on a cycle, one must graft $r$ trees with the constraint that $r + 1 \in \Omega$ (since one edge is coming from the cycle itself). Such legal tuples with a root appended are represented by

$$U = z\phi'(T), \tag{38}$$

since $\phi$ is an exponential generating function and shift ($r \mapsto (r+1)$) corresponds to differentiation. Then connected components and mappings are formed in the usual way by

$$K = \log\frac{1}{1-U}, \qquad F = \exp(K) = \frac{1}{1-U}. \tag{39}$$

The three relations (37), (38), (39) fully determine the EGF of $\Omega$-restricted mappings.

The function $\phi$ is a subseries of the exponential function; hence, it is entire and it satisfies automatically the smoothness conditions of Theorem VII.2, p. 453. With $\tau$ the characteristic value, the function $T(z)$ then has a square-root singularity at $\rho = \tau/\phi(\tau)$. The same holds for $U$, which admits the singular expansion (with $\gamma_1$ a constant simply related to $\gamma$ of equation (22))

$$U(z) \sim 1 - \gamma_1\sqrt{1 - \frac{z}{\rho}}, \tag{40}$$

since $U = z\phi'(T)$. Thus, eventually:

$$F(z) \sim \frac{\kappa}{\sqrt{1 - z/\rho}}, \qquad \kappa := \frac{1}{\gamma_1}.$$

There results the universality of an $n^{-1/2}$ counting law in such constrained mappings:

Proposition VII.4. Consider mappings with node degrees in a set $\Omega \subseteq \mathbb{Z}_{\ge 0}$, such that the corresponding tree family belongs to the smooth implicit function schema and is aperiodic. The number of mappings of size $n$ satisfies

$$\frac{1}{n!}F_n \sim \lambda\,\frac{\rho^{-n}}{\sqrt{\pi n}}, \qquad \lambda := \frac{\phi'(\tau)}{\sqrt{2\phi(\tau)\phi''(\tau)}}.$$

This statement nicely extends what is known to hold for unrestricted mappings. The analysis of additive functionals can then proceed on lines very similar to the case of standard mappings, to the effect that the estimates of the same form as in Figure VII.7 hold, albeit with different multiplicative factors. The programme just sketched has been carried out in a thorough manner by Arney and Bender [18], whose paper provides a detailed treatment. □
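
As a consistency check (a worked instance added for illustration), the chain of relations (37) to (40) can be specialized to unrestricted mappings, that is, $\Omega = \mathbb{Z}_{\ge 0}$ and $\phi(w) = e^w$, for which the exact count $F_n = n^n$ is known. The only ingredients assumed are the expansion (22) with $\gamma = \sqrt{2\phi(\tau)/\phi''(\tau)}$ and the standard estimate $[z^n](1 - z/\rho)^{-1/2} \sim \rho^{-n}/\sqrt{\pi n}$.

```latex
\begin{align*}
\phi(w) = e^{w}: \quad & \tau = 1, \qquad \rho = e^{-1}, \qquad
   \gamma = \sqrt{2\phi(\tau)/\phi''(\tau)} = \sqrt{2},\\
U = z\phi'(T) = ze^{T} = T \quad &\Longrightarrow\quad
   U(z) \sim 1 - \sqrt{2}\,\sqrt{1 - ez}, \qquad \gamma_1 = \sqrt{2},\\
F(z) = \frac{1}{1-U} \sim \frac{1}{\sqrt{2}\,\sqrt{1 - ez}} \quad &\Longrightarrow\quad
   \frac{F_n}{n!} \sim \frac{1}{\sqrt{2}}\,\frac{e^{n}}{\sqrt{\pi n}}
   = \frac{e^{n}}{\sqrt{2\pi n}},\\
\text{exact count:} \quad & \frac{n^{n}}{n!} \sim \frac{n^{n}}{n^{n}e^{-n}\sqrt{2\pi n}}
   = \frac{e^{n}}{\sqrt{2\pi n}} \qquad \text{(Stirling)}.
\end{align*}
```

The constant $1/\sqrt{2}$ obtained here is the value of $1/\gamma_1$ in this special case.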


Example VII.11. Applications of random mapping statistics. There are interesting consequences of the foregoing asymptotic theory of random mappings in several areas of computational mathematics, as we now briefly explain.

Random number generators. Many (pseudo) random number generators operate by iterating a given function $\varphi$ over a finite domain $\mathcal{E}$; usually, $\mathcal{E}$ is a large integer interval $[0\,.\,.\,N-1]$. Such a scheme produces a pseudo-random sequence $u_0, u_1, u_2, \ldots$, where $u_0$ is the “seed” and

$$u_{n+1} = \varphi(u_n).$$

Particular strategies are known for the choice of $\varphi$, which ensure that the “period” (the maximum of $\rho = \lambda + \mu$, where $\lambda$ is the distance to cycle and $\mu$ is the cycle’s length) is of the order of $N$: this is for instance granted by linear congruential generators and feedback register algorithms; see Knuth’s authoritative discussion in [379, Ch. 3]. By contrast, a randomly chosen function $\varphi$ has expected $O(\sqrt{N})$ cycle time (Figure VII.7, p. 462), so that it is highly likely to give rise to a poor generator. As the popular adage says: “A random random number generator is bad!”. Accordingly, one can make use of the results of Figure VII.7 and Example VII.10 in order to compare statistical properties of a proposed random number generator to properties of a random function, and discard the former if there is a manifest closeness.

For instance, take $\varphi$ to be

$$\varphi(x) := x^2 + 1 \bmod (10^6 + 3),$$

where the modulus is a prime number. A random mapping of size $(10^6 + 3)$ is expected to cycle on average after about 1250 steps (the expectation of $\rho = \lambda + \mu$ is $\sim \sqrt{\pi N/2}$ by Figure VII.7). From six starting values $u_0$, we observe the following periods

$$\begin{array}{l|cccccc}
u_0 & 3 & 31 & 314 & 3141 & 31415 & 314159 \\ \hline
\rho = \lambda + \mu & 1569 & 687 & 985 & 813 & 557 & 932
\end{array} \tag{41}$$

whose magnitude looks suspiciously like $\sqrt{N}$. Such a random number generator is thus to be discarded. For similar reasons, von Neumann’s well-known “middle-square” procedure (start from an $\ell$-digit number, then repeatedly square and extract the middle digits) makes for a rather poor random number generator [379, p. 5]. (Related applications to cryptography are presented by Quisquater and Delescaille in [501].)
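
An experiment of the kind reported in (41) is easy to set up. The Python sketch below measures the tail length $\lambda$ and cycle length $\mu$ of an orbit by recording the first visit time of every point, which costs memory proportional to $\lambda + \mu$ (acceptable here, since the orbits are short). The function names `rho_length` and `quadratic_map` are illustrative, and the printed values are not asserted to match the table.

```python
def rho_length(f, x0):
    """Return (lam, mu): the tail length and the cycle length of the orbit
    of x0 under f, found by recording the first visit time of each point."""
    first_seen = {}
    x, step = x0, 0
    while x not in first_seen:
        first_seen[x] = step
        x = f(x)
        step += 1
    lam = first_seen[x]          # time of the first visit to the repeated point
    mu = step - lam              # number of steps between the two visits
    return lam, mu


N = 10 ** 6 + 3                  # the prime modulus used in the example


def quadratic_map(x):
    return (x * x + 1) % N


for seed in (3, 31, 314, 3141, 31415, 314159):
    lam, mu = rho_length(quadratic_map, seed)
    print(seed, lam, mu, lam + mu)
```

Values of $\lambda + \mu$ of the order of $\sqrt{N} \approx 1000$, rather than of the order of $N$, are the signature of random-mapping behaviour discussed above.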


Floyd’s cycle detection. There is a spectacular algorithm due to Floyd [379, Ex. 3.1.6],
for cycle detection, which is well worth knowing when one needs to experiment with large
mappings. Given an initial seed $x_0$ and a mapping $g$, Floyd’s algorithm determines, up to a
small factor, the value of $\rho(x_0) = \lambda(x_0) + \mu(x_0)$, using only two registers. The principle is as
follows. Start a tortoise and a hare on $u_0$ at time 0; then, let the tortoise move at speed 1 along
the rho-shaped path and let the hare move at twice the speed. After $\lambda(x_0)$ steps, the tortoise
joins the cycle, from which time on, the hare, which is already on the cycle, will catch the
tortoise after at most $\mu(x_0)$ steps, since their speed differential on the cycle is one. Pictorially:


[Figure: the tortoise and the hare on the rho-shaped path.]

In more dignified terms, setting
$$X_0 = u_0,\quad X_{n+1} = g(X_n), \qquad\text{and}\qquad Y_0 = u_0,\quad Y_{n+1} = g(g(Y_n)),$$
we have the property that the first value $\nu$ such that $X_\nu = Y_\nu = X_{2\nu}$ must satisfy the inequalities
(42)  $\lambda \le \nu \le \lambda + \mu \le 2\nu$.
The corresponding algorithm is then extremely short: 


Algorithm: Floyd’s Cycle Detector:
tortoise := x_0; hare := x_0; ν := 0;
repeat
    tortoise := g(tortoise); hare := g(g(hare)); ν := ν + 1;
until tortoise = hare {ν is an estimate of λ + μ in the sense of (42)}.
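A direct transcription of the detector into Python may help when experimenting. It is a minimal sketch: the function names are choices of this example, and the brute-force helper `tail_and_cycle` is included only to check the inequalities (42) on sample seeds.

```python
def floyd_meeting_time(g, x0):
    """Smallest nu >= 1 such that X_nu = X_{2 nu}, using two registers only."""
    tortoise, hare, nu = g(x0), g(g(x0)), 1
    while tortoise != hare:
        tortoise = g(tortoise)      # speed 1
        hare = g(g(hare))           # speed 2
        nu += 1
    return nu

def tail_and_cycle(g, x0):
    """Reference values (lam, mu) computed by storing the whole orbit."""
    seen, x, t = {}, x0, 0
    while x not in seen:
        seen[x] = t
        x, t = g(x), t + 1
    return seen[x], t - seen[x]

N = 10**6 + 3
g = lambda x: (x * x + 1) % N

for x0 in (3, 31, 314):
    nu = floyd_meeting_time(g, x0)
    lam, mu = tail_and_cycle(g, x0)
    # The last column checks lambda <= nu <= lambda + mu <= 2 nu.
    print(x0, lam, mu, nu, lam <= nu <= lam + mu <= 2 * nu)
```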


Pollard’s rho method for integer factoring. Pollard [487] had the insight to exploit Floyd’s
algorithm in order to develop an efficient integer factoring method. Assume heuristically that a
quadratic function $x \mapsto x^2 + a \bmod p$, with $p$ a prime number, has statistical properties similar
to those of a random function (we have verified a particular case by (41) above). It must then
tend to cycle after about $\sqrt{p}$ steps. Let $N$ be a (large) number to be factored, and assume for
simplicity that $N = pq$, with $p$ and $q$ both prime (but unknown!). Choose a random $a$ and a
random initial value $x_0$, fix

$$g(x) = x^2 + a \pmod{N},$$

and run the hare-and-tortoise algorithm. By the Chinese Remainder Theorem, the value of a
number $x \bmod N$ is determined by the pair $(x \bmod p,\ x \bmod q)$; the tortoise $T$ and the hare $H$
can then be seen as running two simultaneous races, one modulo $p$, the other modulo $q$. Say
that $p < q$. After about $\sqrt{p}$ steps, one is likely to have

$$H \equiv T \pmod{p},$$

while, most probably, hare and tortoise will be non-congruent mod $q$. In other words, the
greatest common divisor of the difference $(H - T)$ and $N$ will provide $p$; hence it factors $N$.
The resulting algorithm is also extremely short:


Algorithm: Pollard’s Integer Factoring:
choose a, x_0 randomly in [0..N − 1];
T := x_0; H := x_0;
repeat
    T := (T² + a) mod N; H := ((H² + a)² + a) mod N;
    D := gcd(H − T, N);
until D ≠ 1 {if D ≠ N, that is, H − T ≠ 0 mod N, a non-trivial divisor has been found}.


The agreement with what the theory of random mappings predicts is excellent: one indeed
obtains an algorithm that factors large numbers $N$ in $O(N^{1/4})$ operations with high probability
(see for instance the data in [538, p. 470]).
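The factoring loop can be tried directly in Python. The sketch below makes two assumptions that are not in the text: instead of random choices it uses the fixed seed $x_0 = 2$ and tries $a = 1, 2, \ldots$ in turn (a hypothetical deterministic schedule, convenient for reproducibility), and it reports failure when $D = N$, which happens when hare and tortoise meet modulo both prime factors at once.

```python
from math import gcd

def pollard_rho_once(N, a, x0):
    """One hare-and-tortoise race for g(x) = x^2 + a mod N.

    Returns D = gcd(H - T, N) at the first step where D != 1.
    D == N means that this choice of (a, x0) failed.
    """
    T = H = x0
    while True:
        T = (T * T + a) % N
        H = (H * H + a) % N
        H = (H * H + a) % N          # the hare moves twice per step
        D = gcd(H - T, N)
        if D != 1:
            return D

def pollard_rho(N, max_tries=50):
    """Try a = 1, 2, ... until a non-trivial divisor shows up (or give up)."""
    for a in range(1, max_tries + 1):
        D = pollard_rho_once(N, a, x0=2)
        if D != N:
            return D
    return None

print(pollard_rho(8051))   # 8051 = 83 * 97
```

With $a = 1$ and $x_0 = 2$, the race on $N = 8051$ stops at the third step with $\gcd(871 - 677, 8051) = 97$.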

Although Pollard’s algorithm is, for very large $N$, subsumed by other factoring methods, it
is still the best for moderate values of $N$ or for numbers with small divisors, where it proves far
superior to trial divisions. Equally importantly, similar ideas serve in many areas of computa-
tional number theory; for instance the determination of discrete logarithms. (Proving rigorously
what one observes in simulations is another story: it often requires advanced methods of number
theory [23, 442].)

▷ VII.15. Probabilities of first-order sentences. A beautiful theorem of Lynch [426], much
in line with the global aims of analytic combinatorics, gives a class of properties of random
mappings for which asymptotic probabilities are systematically computable. In mathematical
logic, a first-order sentence is built out of variables, equality, boolean connectives ($\vee$, $\wedge$, $\neg$,
etc.), and quantifiers ($\forall$, $\exists$). In addition, there is a function symbol $\varphi$, representing a generic
mapping.

Theorem. Given a property $P$ expressed by a first-order sentence, let $\mu_n(P)$ be the
probability that $P$ is satisfied by a random mapping $\varphi$ of size $n$. Then the quantity
$\mu_\infty(P) = \lim_{n\to\infty} \mu_n(P)$ exists and its value is given by an expression consisting
of integer constants and the operators $+$, $-$, $\times$, $\div$, and $e^x$.

For instance:

$P$:              | $\varphi$ is perm.                     | $\varphi$ without fixed pt.          | $\varphi$ has #leaves $\ge 2$
sentence:         | $\forall x\,\exists y\,[\varphi(y) = x]$ | $\forall x\,\neg[\varphi(x) = x]$     | $\exists x, y\,[x \ne y \wedge \forall z\,[\varphi(z) \ne x \wedge \varphi(z) \ne y]]$
$\mu_\infty(P)$:  | $0$                                    | $e^{-1}$                             | $1$

One can express in this language a property like $P_{12}$: “all cycles of length 1 are attached to
trees of height at most 2”, for which the limit probability is $e^{-1+e^{-1+e^{-1}}}$. The proof of the theo-
rem is based on Ehrenfeucht games supplemented by ingenious inclusion–exclusion arguments.
(Many cases, like $P_{12}$, can be directly treated by singularity analysis.) Compton [125, 126, 127]
has produced lucid surveys of this area, known as finite model theory. ◁


VII. 4. Tree-like structures and implicit functions 


The aim of this section is to demonstrate the universality of the square-root sin-
gularity type for classes of recursively defined structures, which considerably extend
the case of (smooth) simple varieties of trees. The starting point is the investigation of
recursive classes $\mathcal{Y}$, with associated GF $y(z)$, that correspond to a specification:

(43)  $\mathcal{Y} = \mathfrak{G}[\mathcal{Z}, \mathcal{Y}] \quad\Longrightarrow\quad y(z) = G(z, y(z))$.

In the labelled case, $y(z)$ is an EGF and $\mathfrak{G}$ may be an arbitrary composition of basic
constructors, which is reflected by a bivariate function $G(z, w)$; in the unlabelled case,
$y(z)$ is an OGF and $\mathfrak{G}$ may be an arbitrary composition of unions, products, and se-
quences. (Pólya operators corresponding to unlabelled sets and cycles are discussed in
Section VII. 5, p. 475.) This situation covers structures that we have already seen, like
Schröder’s bracketing systems (Chapter I, p. 69) and hierarchies (Chapter II, p. 128),
as well as new ones to be examined here; namely, paths with diagonal steps and trees
with variable node sizes or edge lengths.


VII. 4.1. The smooth implicit-function schema. The investigation of (43) ne- 
cessitates certain analytic conditions to be satisfied by the bivariate function G, which 
we first encapsulate into the definition of a schema. 

Definition VII.4. Let $y(z)$ be a function analytic at 0, $y(z) = \sum_{n \ge 0} y_n z^n$, with $y_0 = 0$
and $y_n \ge 0$. The function is said to belong to the smooth implicit-function schema if
there exists a bivariate $G(z, w)$ such that

$$y(z) = G(z, y(z)),$$

where $G(z, w)$ satisfies the following conditions.

($I_1$): $G(z, w) = \sum_{m,n \ge 0} g_{m,n} z^m w^n$ is analytic in a domain $|z| < R$ and $|w| < S$,
for some $R, S > 0$.
($I_2$): The coefficients of $G$ satisfy

(44)  $g_{m,n} \ge 0$,  $g_{0,0} = 0$,  $g_{0,1} \ne 1$,
      $g_{m,n} > 0$ for some $m$ and for some $n \ge 2$.

($I_3$): There exist two numbers $r, s$, such that $0 < r < R$ and $0 < s < S$, satisfying
the system of equations,

(45)  $G(r, s) = s$,  $G_w(r, s) = 1$,  with $r < R$, $s < S$,

which is called the characteristic system.
A class $\mathcal{Y}$ with a generating function $y(z)$ satisfying $y(z) = G(z, y(z))$ is also said to
belong to the smooth implicit-function schema.


Postulating that $G(z, w)$ is analytic and with non-negative coefficients is a min-
imal assumption in the context of analytic combinatorics. The problem is assumed
to be normalized, so that $y(0) = 0$ and $G(0, 0) = 0$, the condition $g_{0,1} \ne 1$ being
imposed to avoid that the implicit equation be of the reducible form $y = y + \cdots$ (first
line of (44)). The second condition of (44) means that in $G(z, y)$, the dependency on $y$
is nonlinear (otherwise, the analysis reduces to rational and meromorphic asymptotic
methods of Chapter V). The major analytic condition is ($I_3$), which postulates the
existence of positive solutions $r, s$ to the characteristic system within the domain of
analyticity of $G$.

The main result⁷ due to Meir and Moon [439] expresses universality of the square-
root singularity together with its usual consequences regarding asymptotic counting.


Theorem VII.3 (Smooth implicit-function schema). Let $y(z)$ belong to the smooth
implicit-function schema defined by $G(z, w)$, with $(r, s)$ the positive solution of the
characteristic system. Then, $y(z)$ converges at $z = r$, where it has a square-root
singularity,

$$y(z) \underset{z \to r}{=} s - \gamma\sqrt{1 - z/r} + O(1 - z/r), \qquad \gamma := \sqrt{\frac{2 r\, G_z(r, s)}{G_{ww}(r, s)}},$$

the expansion being valid in a $\Delta$-domain. If, in addition, $y(z)$ is aperiodic⁸, then $r$ is
the unique dominant singularity of $y$ and the coefficients satisfy

$$[z^n]\, y(z) \underset{n \to \infty}{=} \frac{\gamma}{2\sqrt{\pi n^3}}\, r^{-n} \left(1 + O(n^{-1})\right).$$


⁷ This theorem has an interesting history. An overly general version of it was first stated by Bender
in 1974 (Theorem 5 of [36]). Canfield [102] pointed out ten years later that Bender’s conditions were
not quite sufficient to grant square-root singularity. A corrected statement was given by Meir and Moon
in [439] with a further (minor) erratum in [438]. We follow here the form given in Theorem 10.13 of
Odlyzko’s survey [461] with the correction of another minor misprint (regarding $g_{0,1}$, which should read
$g_{0,1} \ne 1$). A statement concerning a restricted class of functions (either polynomial or entire) already
appears in Hille’s book [334, vol. I, p. 274].

⁸ In the usual sense of Definition IV.5, p. 266. Equivalently, there exist three indices $i < j < k$ such
that $y_i\, y_j\, y_k \ne 0$ and $\gcd(j - i, k - i) = 1$.

Observe that the statement implies the existence of exactly one root of the char-
acteristic system within the part of the positive quadrant where $G$ is analytic, since,
obviously, $y_n$ cannot admit two asymptotic expressions with different parameters. A
complete expansion exists in powers of $(1 - z/r)^{1/2}$ (for $y(z)$) and in powers of $1/n$
(for $y_n$), while periodic cases can be treated by a simple extension of the technical
apparatus to be developed.

The proof of this theorem first necessitates two lemmas of independent interest: 
(i) Lemma VII.2 is logically equivalent to an analytic version of the classical Im- 
plicit Function Theorem found in Appendix B.5: Implicit Function Theorem, p. 753. 
(ii) Lemma VII.3 supplements this by describing what happens at a point where the 
implicit function theorem “fails”. These two statements extend the analytic and sin- 
gular inversion lemmas of Subsection IV. 7.1, p. 275. 


Lemma VII.2 (Analytic Implicit Functions). Let $F(z, w)$ be a bivariate function
analytic at $(z, w) = (z_0, w_0)$. Assume that $F(z_0, w_0) = 0$ and $F_w(z_0, w_0) \ne 0$.
Then, there exists a unique function $y(z)$ analytic in a neighbourhood of $z_0$ such that
$y(z_0) = w_0$ and $F(z, y(z)) = 0$.

Proof. This is a restatement of the Analytic Implicit Function Theorem of Appen-
dix B.5: Implicit Function Theorem, p. 753, upon effecting a translation $z \mapsto z + z_0$,
$w \mapsto w + w_0$. □


Lemma VII.3 (Singular Implicit Functions). Let $F(z, w)$ be a bivariate function an-
alytic at $(z, w) = (z_0, w_0)$. Assume the conditions: $F(z_0, w_0) = 0$, $F_z(z_0, w_0) \ne 0$,
$F_w(z_0, w_0) = 0$, and $F_{ww}(z_0, w_0) \ne 0$. Choose an arbitrary ray of angle $\theta$ ema-
nating from $z_0$. Then there exists a neighbourhood $\Omega$ of $z_0$ such that at every point $z$
of $\Omega$ with $z \ne z_0$ and $z$ not on the ray, the equation $F(z, y) = 0$ admits two analytic
solutions $y_1(z)$ and $y_2(z)$ that satisfy, as $z \to z_0$:

$$y_1(z) = w_0 - \gamma\sqrt{1 - z/z_0} + O(1 - z/z_0), \qquad \gamma := \sqrt{\frac{2 z_0 F_z(z_0, w_0)}{F_{ww}(z_0, w_0)}},$$

and similarly for $y_2$ whose expansion is obtained by changing $\sqrt{\ }$ to $-\sqrt{\ }$.


Proof. Locally, near $(r, s)$, the function $F(z, w)$ behaves like

(46)  $F + (w - s)F_w + (z - r)F_z + \frac{1}{2}(w - s)^2 F_{ww}$,

(plus smaller order terms), where $F$ and its derivatives are evaluated at the point $(r, s)$.
Since $F = F_w = 0$, cancelling (46) suggests for the solutions of $F(z, w) = 0$ near
$z = r$ the form
$$w - s = \pm\gamma\sqrt{r - z} + O(z - r),$$
which is consistent with the statement. This informal argument can be justified by the
following steps (details omitted): (a) establish the existence of a formal solution in
powers of $\pm(1 - z/r)^{1/2}$; (b) prove, by the method of majorant series, that the formal
solutions also converge locally and provide a solution to the equation.

Alternatively, by the Weierstrass Preparation Theorem (Appendix B.5: Implicit
Function Theorem, p. 753) the two solutions $y_1(z)$, $y_2(z)$ that assume the value $s$
at $z = r$ are solutions of a quadratic equation
$$(Y - s)^2 + b(z)(Y - s) + c(z) = 0,$$
where $b$ and $c$ are analytic at $z = r$, with $b(r) = c(r) = 0$. The solutions are then
obtained by the usual formula for solving a quadratic equation,
$$Y - s = \frac{1}{2}\left(-b(z) \pm \sqrt{b(z)^2 - 4c(z)}\right),$$
which provides for $y_1(z)$ an expression as the square-root of an analytic function and
yields the statement. □


Figure VII.9. The connection problem for the equation $w = \frac{1}{4}z + w^2$ (with explicit
forms $w = (1 \pm \sqrt{1 - z})/2$): the combinatorial solution $y(z)$ near $z = 0$ and the two
analytic solutions $y_1(z)$, $y_2(z)$ near $z = 1$.


It is now possible to return to the proof of our main statement. 


Proof. [Theorem VII.3] Given the two lemmas, the general idea of the proof of The-
orem VII.3 can be easily grasped. Set $F(z, w) = w - G(z, w)$. There exists a unique
analytic function $y(z)$ satisfying $y = G(z, y)$ near $z = 0$, by the analytic lemma. On
the other hand, by the singular lemma, near the point $(z, w) = (r, s)$, there exist two
solutions $y_1$, $y_2$, both of which have a square root singularity. Given the positive char-
acter of the coefficients of $G$, it is not hard to see that, of $y_1$, $y_2$, the function $y_1(z)$ is
increasing as $z$ approaches $z_0 = r$ from the left (assuming the principal determination of
the square root in the definition of $\gamma$). A simple picture of the situation regarding the
solutions to the equation $y = G(z, y)$ is exemplified by Figure VII.9.

The problem is then to show that a smooth analytic curve (the thin-line curve
in Figure VII.9) does connect the positive-coefficient solution at 0 to the increasing-
branch solution at $r$. Precisely, one needs to check that $y_1(z)$ (defined near $r$) is the
analytic continuation of $y(z)$ (defined near 0) as $z$ increases along the positive real
axis. This is indeed a delicate connection problem whose technical proof is discussed
in Note VII.16. Once this fact is granted and it has been verified that $r$ is the unique
dominant singularity of $y(z)$ (Note VII.17), the statement of Theorem VII.3 follows
directly by singularity analysis. □
▷ VII.16. The connection problem for implicit functions. A proof that $y(z)$ and $y_1(z)$ are well
connected is given by Meir and Moon in the study [439], from which our description is adapted.

Let $\rho$ be the radius of convergence of $y(z)$ at 0 and $\tau = y(\rho)$. The point $\rho$ is a singularity
of $y(z)$ by Pringsheim’s Theorem. The goal is to establish that $\rho = r$ and $\tau = s$. Regarding the
curve
$$\mathcal{C} = \{(z, y(z)) \mid 0 \le z \le \rho\},$$
this means that three cases are to be excluded:

(a) $\mathcal{C}$ stays entirely in the interior of the rectangle
    $R := \{(z, y) \mid 0 \le z \le r,\ 0 \le y \le s\}$.
(b) $\mathcal{C}$ intersects the upper side of the rectangle $R$ at some point of abscissa $r_0 < r$ where
    $y(r_0) = s$.
(c) $\mathcal{C}$ intersects the right-most side of the rectangle $R$ at the point $(r, y(r))$ with $y(r) < s$.


Graphically, the three cases are depicted in Figure VII.10. 


Figure VII.10. The three 
cases (a), (b), and (c), to be 
excluded (solid lines). 


In the discussion, we make use of the fact that $G(z, w)$, which has non-negative coefficients,
is an increasing function in each of its arguments. Also, the form

(47)  $\dfrac{dy}{dz} = \dfrac{G_z(z, y)}{1 - G_w(z, y)}$

shows differentiability (hence analyticity) of the solution $y$ as soon as $G_w(z, y) \ne 1$.

Case (a) is excluded. Assume that $0 < \rho < r$ and $0 < \tau < s$. Then, we have $G_w(r, s) = 1$,
and by monotonicity properties of $G_w$, the inequality $G_w(\rho, \tau) < 1$ holds. But then $y(z)$
must be analytic at $z = \rho$, which contradicts the fact that $\rho$ is a singularity.

Case (b) is excluded. Assume that $0 < r_0 < r$ and $y(r_0) = s$. Then there are two distinct
points on the implicit curve $y = G(z, y)$ at the same altitude, namely $(r_0, s)$ and $(r, s)$, implying
the equalities
$$y(r_0) = G(r_0, y(r_0)) = s = G(r, s),$$
which contradicts the monotonicity properties of $G$.

Case (c) is excluded. Assume that $y(r) < s$. Let $a < r$ be a point chosen close enough to $r$.
Then above $a$, there are three branches of the curve $y = G(z, y)$, namely $y(a)$, $y_1(a)$, $y_2(a)$,
where the existence of $y_1$, $y_2$ results from Lemma VII.3. This means that the function $y \mapsto
G(a, y)$ has a graph that intersects the main diagonal at three points, a contradiction with the
fact that $G(a, y)$ is a convex function of $y$. ◁

▷ VII.17. Unicity of the dominant singularity. From the previous note, we know that $y(r) = s$,
with $r$ the radius of convergence of $y$. The aperiodicity of $y$ implies that $|y(\zeta)| < y(r)$ for all
$\zeta$ such that $|\zeta| = r$ and $\zeta \ne r$ (see the Daffodil Lemma IV.1, p. 266). One then has for any
such $\zeta$ the property: $|G_w(\zeta, y(\zeta))| < G_w(r, s) = 1$, by monotonicity of $G_w$. But then by (47)
above, this implies that $y(\zeta)$ is analytic at $\zeta$. ◁

The solutions to the characteristic system (45) can be regarded as the intersection 
points of two curves, namely, 


$$G(r, s) - s = 0, \qquad G_w(r, s) = 1.$$


Here are plots in the case of two functions G: the first one has non-negative coeffi- 
cients whereas the second one (corresponding to a counterexample of Canfield [102]) 
involves negative coefficients. Positivity of coefficients implies convexity properties 
that avoid pathological situations. 


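[Figure: plots in the $(r, s)$ plane of the two curves of the characteristic system for two functions $G(z, y)$; left panel: a function with non-negative coefficients (positive); right panel: Canfield’s counterexample (not positive).]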


VII. 4.2. Combinatorial applications. Many combinatorial classes, which ad-
mit a recursive specification of the form $\mathcal{Y} = \mathfrak{G}[\mathcal{Z}, \mathcal{Y}]$, as in (43), p. 467, can be
subjected to Theorem VII.3. The resulting structures are, to varying degrees, avatars
of tree structures. In what follows, we describe a few instances in which the square-
root universality holds.


(i) Hierarchies are trees enumerated by the number of their leaves (Exam- 
ples VII.12 and VII.13). 

(ii) Trees with variable node sizes generalize simple families of trees; they oc- 
cur in particular as mathematical models of secondary structures in biology 
(Example VII. 14). 

(iii) Lattice paths with variable edge lengths are attached to some of the most 
classical objects of combinatorial theory (Note VII.19). 


Example VII.12. Labelled hierarchies. The class $\mathcal{L}$ of labelled hierarchies, as defined in
Note II.19, p. 128, satisfies

$$\mathcal{L} = \mathcal{Z} + \mathrm{SET}_{\ge 2}(\mathcal{L}) \quad\Longrightarrow\quad L = z + e^{L} - 1 - L.$$

Figure VII.11. A hierarchy placed on some of the modern Indo-European languages.


These occur in statistical classification theory: given a collection of $n$ distinguished items, $L_n$
is the number of ways of superimposing a non-trivial classification (cf. Figure VII.11). Such
abstract classifications usually have no planar structure, hence our modelling by a labelled set
construction.

In the notations of Definition VII.4, p. 467, the basic function is $G(z, w) = z + e^w - 1 - w$,
which is analytic in $|z| < \infty$, $|w| < \infty$. The characteristic system is

$$r + e^s - 1 - s = s, \qquad e^s - 1 = 1,$$

which has a unique positive solution, $s = \log 2$, $r = 2\log 2 - 1$, obtained by solving the second
equation for $s$, then propagating the solution to get $r$. Thus, hierarchies belong to the smooth
implicit-function schema, and, by Theorem VII.3, the EGF $L(z)$ has a square-root singularity.
One then finds mechanically


$$\frac{1}{n!} L_n \sim \frac{1}{2\sqrt{\pi}}\, n^{-3/2}\, (2\log 2 - 1)^{-n+1/2}.$$

(The unlabelled counterpart is the object of Note VII.23, p. 479.) □
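The estimate for hierarchies can be checked numerically. The sketch below is illustrative: it relies on a recurrence that is not stated in the text but follows from it, namely that $e^L = 2L - z + 1$, so differentiating $L = z + e^L - 1 - L$ gives $L' = 1 + L'\,(2L - z)$. Function names are choices of this example.

```python
from fractions import Fraction
import math

def hierarchy_egf_coeffs(n_max):
    """Coefficients l_n = L_n / n! of L(z), from L' = 1 + L' (2 L - z)."""
    l = [Fraction(0), Fraction(1)]      # l_0 = 0, l_1 = 1
    d = [Fraction(1)]                   # d_k = (k + 1) l_{k+1} = [z^k] L'(z)
    for k in range(1, n_max):
        # m(j) = [z^j] (2 L(z) - z), which needs l_j for j <= k only
        m = lambda j: 2 * l[j] - (1 if j == 1 else 0)
        d_k = sum(m(j) * d[k - j] for j in range(1, k + 1))
        d.append(d_k)
        l.append(d_k / (k + 1))
    return l

def asymptotic(n):
    """Leading term (1 / (2 sqrt(pi))) n^(-3/2) r^(-n + 1/2), r = 2 log 2 - 1."""
    r = 2 * math.log(2) - 1
    return r ** (0.5 - n) / (2 * math.sqrt(math.pi) * n ** 1.5)

l = hierarchy_egf_coeffs(60)
print([int(l[n] * math.factorial(n)) for n in range(1, 7)])   # first values of L_n
print(float(l[60]) / asymptotic(60))                           # ratio exact / estimate
```

Exact rational arithmetic is used so that the comparison with the asymptotic form is not polluted by rounding in the recurrence.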


▷ VII.18. The degree profile of hierarchies. Combining BGF techniques and singularity anal-
ysis, it is found that a random hierarchy of some large size $n$ has on average about $0.57n$ nodes
of degree 2, $0.18n$ nodes of degree 3, $0.04n$ nodes of degree 4, and less than $0.01n$ nodes of
degree 5 or higher. ◁


Example VII.13. Trees enumerated by leaves. For a (non-empty) set $\Omega \subset \mathbb{Z}_{\ge 0}$ that does not
contain 0, 1, it makes sense to consider the class of labelled trees,

$$\mathcal{C} = \mathcal{Z} + \mathrm{SEQ}_{\Omega}(\mathcal{C}) \qquad\text{or}\qquad \mathcal{C} = \mathcal{Z} + \mathrm{SET}_{\Omega}(\mathcal{C}).$$

(A similar discussion can be conducted for unlabelled plane trees, with OGFs replacing EGFs.)
These are rooted trees (plane or non-plane, respectively), with size determined by the number
of leaves and with degrees constrained to lie in $\Omega$. The EGF is then of the form

$$C(z) = z + \eta(C(z)).$$

This variety of trees includes the labelled hierarchies, which correspond to $\eta(w) = e^w - 1 - w$.

Assume for simplicity $\eta$ to be entire (possibly a polynomial). The basic function is
$G(z, w) = z + \eta(w)$, and the characteristic system is $s = r + \eta(s)$, $\eta'(s) = 1$. Since $\eta'(0) = 0$
and $\eta'(+\infty) = +\infty$, this system always has a solution:

$$s = \eta'^{(-1)}(1), \qquad r = s - \eta(s).$$

[Figure VII.12 text:] A fragment of RNA is, in first approximation, a tree-
like structure with edges corresponding to base pairs
and “loops” corresponding to leaves. There are con-
straints on the sizes of leaves (taken here between 4 and
7) and length of edges (here between 1 and 4 base pairs).
We model such an RNA fragment as a planted tree $\mathcal{P}$ at-
tached to a binary tree ($\mathcal{Y}$) with equations:

$$(*)\qquad \mathcal{P} = \mathcal{A}\mathcal{Y}, \quad \mathcal{Y} = \mathcal{A}\mathcal{Y}^2 + \mathcal{B}, \qquad A = z^2 + z^4 + z^6 + z^8, \quad B = z^4 + z^5 + z^6 + z^7.$$

Figure VII.12. A simplified combinatorial model of RNA structures analogous to
those considered by Waterman et al.

Thus Theorem VII.3 applies, giving

(48)  $[z^n]\, C(z) \sim \dfrac{\gamma}{2\sqrt{\pi n^3}}\, r^{-n}, \qquad \gamma = \sqrt{\dfrac{2r}{\eta''(s)}},$

and a complete expansion can be obtained. □


Example VII.14. Trees with variable edge lengths and node sizes. Consider unlabelled plane
trees in which nodes can be of different sizes: what is given is a set $\Omega$ of ordered pairs $(\omega, \sigma)$,
where a value $(\omega, \sigma)$ means that a node of degree $\omega$ and size $\sigma$ is allowed. Simple varieties
in their basic form correspond to $\sigma \equiv 1$; trees enumerated by leaves (including hierarchies)
correspond to $\sigma \in \{0, 1\}$ with $\sigma = 1$ iff $\omega = 0$. Figure VII.12 suggests the way such trees can
model the self-bonding of single-stranded nucleic acids like RNA, according to Waterman et
al. [336, 453, 534, 558]. Clearly an extremely large number of variations are possible.
The fundamental equation in the case of a finite $\Omega$ is

$$Y(z) = P(z, Y(z)), \qquad P(z, w) := \sum_{(\omega, \sigma) \in \Omega} z^{\sigma} w^{\omega},$$

with $P$ a polynomial. In the aperiodic case, there is invariably a formula of the form
$$Y_n \sim \kappa \cdot A^n n^{-3/2},$$
corresponding to the universal square-root singularity. □


▷ VII.19. Schröder numbers. Consider the class $\mathcal{Y}$ of unary–binary trees where unary nodes
have size 2, while leaves and binary nodes have the usual size 1. The GF satisfies $Y = z +
z^2 Y + zY^2$, so that

$$Y(z) = z\,D(z^2), \qquad D(z) = \frac{1 - z - \sqrt{1 - 6z + z^2}}{2z}.$$

We have $D(z) = 1 + 2z + 6z^2 + 22z^3 + 90z^4 + 394z^5 + \cdots$, which is EIS A006318 (“Large
Schröder numbers”). By the bijective correspondence between trees and lattice paths, $\mathcal{Y}_{2n+1}$ is
in correspondence with excursions of length $2n$ made of steps $(1, 1)$, $(2, 0)$, $(1, -1)$. Upon tilting
by 45°, this is equivalent to paths connecting the lower left corner to the upper right corner of
an $(n \times n)$ square that are made of horizontal, vertical, and diagonal steps, and never go under
the main diagonal. The series $S = \frac{z}{2}(1 + D)$ enumerates Schröder’s generalized parenthesis
systems (Chapter I, p. 69): $S := z + S^2/(1 - S)$, and the asymptotic formula (for $n \ge 2$)
$$\frac{1}{2} Y_{2n-1} = S_n = \frac{1}{2} D_{n-1} \sim \frac{2^{1/4}}{4\sqrt{\pi n^3}} \left(3 - 2\sqrt{2}\right)^{-n+1/2}$$
follows straightforwardly from the singular expansion $\sqrt{1 - 6z + z^2} \sim \sqrt{1 - \rho^2}\,\sqrt{1 - z/\rho}$
at $\rho = 3 - 2\sqrt{2}$, where $1 - \rho^2 = 4\sqrt{2}\,\rho$. ◁
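The Schröder numbers and their square-root asymptotics are easy to check by machine. The sketch below is an illustration under stated assumptions: it computes $D_n$ from the quadratic equation $D = 1 + zD + zD^2$ satisfied by $D(z)$, and it takes the leading constant directly from the singular expansion of $S(z) = \frac{1}{4}\left(1 + z - \sqrt{1 - 6z + z^2}\right)$ at $\rho = 3 - 2\sqrt{2}$, namely $S_n \sim \frac{\sqrt{1 - \rho^2}}{8\sqrt{\pi n^3}}\,\rho^{-n}$. Function names are choices of this example.

```python
import math

def large_schroeder(n_max):
    """D_0 .. D_{n_max} from D = 1 + z D + z D^2."""
    D = [1]
    for n in range(1, n_max + 1):
        conv = sum(D[i] * D[n - 1 - i] for i in range(n))   # [z^(n-1)] D(z)^2
        D.append(D[n - 1] + conv)
    return D

def schroeder_asymptotic(n):
    """Leading term for S_n = D_{n-1} / 2, from the square-root singularity at rho."""
    rho = 3 - 2 * math.sqrt(2)
    const = math.sqrt(1 - rho * rho) / 8
    return const * rho ** (-n) / math.sqrt(math.pi * n**3)

D = large_schroeder(200)
print(D[:6])                                   # 1, 2, 6, 22, 90, 394
for n in (10, 50, 200):
    S_n = D[n - 1] // 2                        # D_{n-1} is even for n >= 2
    print(n, S_n / schroeder_asymptotic(n))    # ratio tends to 1
```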


VII. 5. Unlabelled non-plane trees and Pólya operators


Essentially all the results obtained earlier for simple varieties of trees can be ex-
tended to the case of non-plane unlabelled trees. Pólya operators are central, and
their treatment is typical of the asymptotic theory of unlabelled objects obeying sym-
metries (i.e., involving the unlabelled MSET, PSET, CYC constructions), as we have
seen repeatedly in this book.


Binary and general trees. We start the discussion by considering the enumer-
ation of two classes of non-plane trees following Pólya [488, 491] and Otter [466],
whose articles are important historic sources for the asymptotic theory of non-plane
tree enumeration; a brief account also appears in [319]. (These authors used the
more traditional method of Darboux instead of singularity analysis, but this distinc-
tion is immaterial here, as calculations develop under completely parallel lines under
both theories.) The two classes under consideration are those of general and binary
non-plane unlabelled trees. In both cases, there is a fairly direct reduction to the enu-
meration of Cayley trees and of binary trees, which renders explicit several steps of
the calculation. The trick is, as usual, to treat values of $f(z^2)$, $f(z^3)$, ..., arising from
Pólya operators, as “known” analytic quantities.


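Proposition VII.5 (Special unlabelled non-plane trees). Consider the two classes of
unlabelled non-plane trees

$$\mathcal{H} = \mathcal{Z} \times \mathrm{MSET}(\mathcal{H}), \qquad \mathcal{W} = \mathcal{Z} \times \mathrm{MSET}_{\{0,2\}}(\mathcal{W}),$$
respectively, of the general and binary type. Then, with constants $\gamma_H$, $A_H$ and $\gamma_W$, $A_W$
given by Notes VII.21 and VII.22, one has

(49)  $H_n \sim \dfrac{\gamma_H}{2\sqrt{\pi n^3}}\, A_H^{\,n}, \qquad W_{2n-1} \sim \dfrac{\gamma_W}{2\sqrt{\pi n^3}}\, A_W^{\,n}.$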
m7 aN 


Proof. (i) General case. The OGF of non-plane unlabelled trees is the analytic solu-
tion to the functional equation

(50)  $H(z) = z \exp\left(\dfrac{H(z)}{1} + \dfrac{H(z^2)}{2} + \cdots\right).$

Let $T$ be the solution to
(51)  $T(z) = z\, e^{T(z)},$
that is to say, the Cayley function. The function $H(z)$ has a radius of convergence $\rho$
strictly less than 1 as its coefficients dominate those of $T(z)$, the radius of convergence
of the latter being exactly $e^{-1} \doteq 0.367$. The radius $\rho$ cannot be 0 since the number of
trees is bounded from above by the number of plane trees whose OGF has radius 1/4.
Thus, one has $1/4 \le \rho \le e^{-1}$.

Rewriting the defining equation of $H(z)$ as

$$H(z) = \zeta\, e^{H(z)}, \qquad\text{with}\qquad \zeta := z \exp\left(\frac{H(z^2)}{2} + \frac{H(z^3)}{3} + \cdots\right),$$

we observe that $\zeta = \zeta(z)$ is analytic for $|z| < \rho^{1/2}$; that is, $\zeta$ is analytic in a disc that
properly contains the disc of convergence of $H(z)$. We may thus rewrite $H(z)$ as
$$H(z) = T(\zeta(z)).$$
Since $\zeta(z)$ is analytic at $z = \rho$, a singular expansion of $H(z)$ near $z = \rho$ results from
composing the singular expansion of $T$ at $e^{-1}$ with the analytic expansion of $\zeta$ at $\rho$.
In this way, we get:

(52)  $H(z) = 1 - \gamma\left(1 - \dfrac{z}{\rho}\right)^{1/2} + O\left(1 - \dfrac{z}{\rho}\right), \qquad \gamma = \sqrt{2 e \rho\, \zeta'(\rho)}.$

Thus,
$$[z^n] H(z) \sim \frac{\gamma}{2\sqrt{\pi n^3}}\, \rho^{-n}.$$

(ii) Binary case. Consider the functional equation

(53)  $f(z) = z + \dfrac{1}{2} f(z)^2 + \dfrac{1}{2} f(z^2).$

This enumerates non-plane binary trees with size defined as the number of external
nodes, so that $W(z) = \frac{1}{z} f(z^2)$. Thus, it suffices to analyse $[z^n] f(z)$, which dispenses
us from dealing with periodicity phenomena arising from the parity of $n$.

The OGF $f(z)$ has a radius of convergence $\rho$ that is at least 1/4 (since there are
fewer non-plane trees than plane ones). It is also at most 1/2, which is seen from a
comparison of $f$ with the solution to the equation $g = z + \frac{1}{2} g^2$. We may then proceed
as before: treat the term $\frac{1}{2} f(z^2)$ as a function analytic in $|z| < \rho^{1/2}$, as though it were
known, then solve. To this effect, set

$$\zeta(z) := z + \frac{1}{2} f(z^2),$$
which exists in $|z| < \rho^{1/2}$. Then, the equation (53) becomes a plain quadratic equa-
tion, $f = \zeta + \frac{1}{2} f^2$, with solution
$$f(z) = 1 - \sqrt{1 - 2\zeta(z)}.$$
The singularity $\rho$ is the smallest positive solution of $\zeta(\rho) = 1/2$. The singular expan-
sion of $f$ is obtained by combining the analytic expansion of $\zeta$ at $\rho$ with $\sqrt{1 - 2\zeta}$.
The usual square-root singularity results:
$$f(z) \sim 1 - \gamma\sqrt{1 - z/\rho}, \qquad \gamma := \sqrt{2\rho\, \zeta'(\rho)}.$$
This induces the $\rho^{-n} n^{-3/2}$ form for the coefficients $[z^n] f(z) \equiv [z^{2n-1}] W(z)$. □
The argument used in the proof of the proposition may seem partly non-constructive.
However, numerically, the values of $\rho$ and $\gamma$ can be determined to great accuracy.
See the notes below as well as Finch’s section on “Otter’s tree enumeration con-
stants” [211, Sec. 5.6].

▷ VII.20. Complete asymptotic expansions for $H_n$, $W_{2n-1}$. These can be determined since the
OGFs admit complete asymptotic expansions in powers of $\sqrt{1 - z/\rho}$. ◁

▷ VII.21. Numerical evaluation of constants I. Here is an unoptimized procedure controlled
by a parameter $m \ge 0$ for evaluating the constants $\gamma_H$, $\rho_H$ of (49) relative to general unlabelled
non-plane trees.

Procedure Get_value_of_ρ(m : integer);
1. Set up a procedure to compute and memorize the $H_n$ on demand;
   (this can be based on recurrence relations implied by $H'(z)$; see [456])
2. Define $f^{[m]}(z) := \sum_{n=1}^{m} H_n z^n$;
3. Define $\zeta^{[m]}(z) := z \exp\left(\sum_{k=2}^{m} \frac{1}{k} f^{[m]}(z^k)\right)$;
4. Solve numerically $\zeta^{[m]}(x) = e^{-1}$ for $x \in (0, 1)$ to max(m, 10) digits of accuracy;
5. Return $x$ as an approximation to $\rho$.

For instance, a conservative estimate of the accuracy attained for $m = 0, 10, \ldots, 50$ (in a few
billion machine instructions) is:

m = 0            | m = 10     | m = 20      | m = 30      | m = 40      | m = 50
$3\cdot 10^{-2}$ | $10^{-6}$  | $10^{-11}$  | $10^{-16}$  | $10^{-21}$  | $10^{-26}$

Accuracy appears to be a little better than $10^{-m/2}$. This yields to 25D:

$\rho \doteq 0.3383218568992076951961126$,  $A_H \equiv \rho^{-1} \doteq 2.955765285651994974714818$,
$\gamma_H \doteq 1.559490020374640885542206$.

The formula of Proposition VII.5 estimates $H_{100}$ with a relative error of $10^{-3}$. ◁
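The procedure of this note can be tried in ordinary floating-point arithmetic, which limits the attainable accuracy to about 15 digits. The Python sketch below makes several assumptions of its own: the counts $H_n$ are produced by the classical recurrence $n H_{n+1} = \sum_{k=1}^{n} \big(\sum_{d \mid k} d H_d\big) H_{n-k+1}$ (which may differ from the one in [456]), the root is found by bisection on the bracket $[0.1, 0.5]$, and $\zeta'(\rho)$ in $\gamma_H = \sqrt{2e\rho\,\zeta'(\rho)}$ is approximated by a central difference.

```python
import math

def rooted_tree_counts(n_max):
    """H[1..n_max]: unlabelled non-plane rooted trees by number of nodes (H[0] = 0)."""
    H = [0, 1]
    for n in range(1, n_max):
        total = 0
        for k in range(1, n + 1):
            s_k = sum(d * H[d] for d in range(1, k + 1) if k % d == 0)
            total += s_k * H[n - k + 1]
        H.append(total // n)
    return H

def get_value_of_rho(m):
    """Steps 1-5 of the procedure, returning (rho, zeta) with zeta = zeta^[m]."""
    H = rooted_tree_counts(max(m, 1))
    f = lambda x: sum(H[n] * x**n for n in range(1, m + 1))
    zeta = lambda x: x * math.exp(sum(f(x**k) / k for k in range(2, m + 1)))
    lo, hi = 0.1, 0.5                      # zeta is increasing on this bracket
    for _ in range(100):
        mid = (lo + hi) / 2
        if zeta(mid) < math.exp(-1):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2, zeta

rho, zeta = get_value_of_rho(30)
h = 1e-6
zeta_prime = (zeta(rho + h) - zeta(rho - h)) / (2 * h)     # numerical derivative
gamma_H = math.sqrt(2 * math.e * rho * zeta_prime)
print(rho, 1 / rho, gamma_H)
```

Calling `get_value_of_rho(0)` returns $e^{-1}$, which reproduces the error of about $3 \cdot 10^{-2}$ in the first column of the table.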
▷ VII.22. Numerical evaluation of constants II. The procedure of the previous note adapts
easily to binary trees, giving:

$\rho \doteq 0.4026975036714412909690453$,  $A_W \equiv \rho^{-1} \doteq 2.483253536172636858562289$,
$\gamma_W \doteq 1.130033716398972007144137$.

The formula of Proposition VII.5 estimates $[z^{100}] f(z)$ with a relative error of $7 \cdot 10^{-3}$. ◁
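The binary case can be sketched in the same way. In the code below (an illustration, with names chosen for this example), the coefficients $f_n$ come from extracting $[z^n]$ in (53), the truncated function $\zeta(z) = z + \frac{1}{2}\sum_{n \le m} f_n z^{2n}$ is solved for $\zeta(\rho) = 1/2$ by bisection on an assumed bracket $[0.3, 0.5]$, and $\gamma_W = \sqrt{2\rho\,\zeta'(\rho)}$ uses the exact derivative of the truncated series.

```python
import math

def binary_tree_counts(n_max):
    """f[1..n_max]: non-plane binary trees by number of external nodes (f[0] = 0)."""
    f = [0, 1]
    for n in range(2, n_max + 1):
        conv = sum(f[i] * f[n - i] for i in range(1, n))     # [z^n] f(z)^2
        extra = f[n // 2] if n % 2 == 0 else 0               # [z^n] f(z^2)
        f.append((conv + extra) // 2)
    return f

def binary_constants(m=40):
    """Return (rho, gamma_W) from the series of f truncated at order m."""
    f = binary_tree_counts(m)
    zeta = lambda x: x + 0.5 * sum(f[n] * x ** (2 * n) for n in range(1, m + 1))
    zeta_prime = lambda x: 1 + sum(n * f[n] * x ** (2 * n - 1) for n in range(1, m + 1))
    lo, hi = 0.3, 0.5                      # zeta(0.3) < 1/2 < zeta(0.5)
    for _ in range(100):
        mid = (lo + hi) / 2
        if zeta(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    rho = (lo + hi) / 2
    return rho, math.sqrt(2 * rho * zeta_prime(rho))

rho_W, gamma_W = binary_constants()
print(rho_W, 1 / rho_W, gamma_W)
```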


The results relative to general and binary trees are thus obtained by a modification
of the method used for simple varieties of trees, upon treating the Pólya operator part
as an analytic variant of the corresponding equations of simple varieties of trees.


Alkanes, alcohols, and degree restrictions. The previous two examples suggest
that a general theory is possible for varieties of unlabelled non-plane trees, $\mathcal{T} =
\mathcal{Z}\,\mathrm{MSET}_{\Omega}(\mathcal{T})$, determined by some $\Omega \subset \mathbb{Z}_{\ge 0}$. First, we examine the case of spe-
cial regular trees defined by $\Omega = \{0, 3\}$, which, when viewed as alkanes and alcohols,
are of relevance to combinatorial chemistry (Example VII.15). Indeed, the problem
of enumerating isomers of such chemical compounds has been at the origin of Pólya’s
foundational works [488, 491]. Then, we extend the method to the general situation
of trees with degrees constrained to an arbitrary finite set $\Omega$ (Theorem VII.4).


Example VII.15. Non-plane trees and alkanes. In chemistry, carbon atoms (C) are known
to have valency 4 while hydrogen (H) has valency 1. Alkanes, also known as paraffins (Fig-
ure VII.13), are acyclic molecules formed of carbon and hydrogen atoms according to this rule
and without multiple bonds; they are thus of the type $C_n H_{2n+2}$. In combinatorial terms, we
are talking of unrooted trees with (total) node degrees in $\{1, 4\}$. The rooted version of these
trees are determined by the fact that a root is chosen and (out)degrees of nodes lie in the set
$\Omega = \{0, 3\}$; such rooted ternary trees then correspond to alcohols (with the OH group marking
one of the carbon atoms).

[Figure: structural diagrams of Methane, Ethane, Propane, and Propanol.]

Figure VII.13. A few examples of alkanes ($CH_4$, $C_2H_6$, $C_3H_8$) and an alcohol.


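Alcohols ($\mathcal{A}$) are the simplest to enumerate, since they correspond to rooted trees. The
OGF starts as (EIS A000598)

$$A(z) = 1 + z + z^2 + 2z^3 + 4z^4 + 8z^5 + 17z^6 + 39z^7 + 89z^8 + \cdots,$$
with size being taken here as the number of internal nodes. The specification is
$$\mathcal{A} = \{\epsilon\} + \mathcal{Z}\,\mathrm{MSET}_3(\mathcal{A}).$$
(Equivalently $\mathcal{A}^+ := \mathcal{A} \setminus \{\epsilon\}$ satisfies $\mathcal{A}^+ = \mathcal{Z}\,\mathrm{MSET}_{0,1,2,3}(\mathcal{A}^+)$.) This implies that $A(z)$
satisfies the functional equation:
$$A(z) = 1 + z\left(\frac{1}{3} A(z^3) + \frac{1}{2} A(z) A(z^2) + \frac{1}{6} A(z)^3\right).$$
In order to apply Theorem VII.3, introduce the function

(54)  $G(z, w) = 1 + z\left(\dfrac{1}{3} A(z^3) + \dfrac{1}{2} A(z^2)\, w + \dfrac{1}{6} w^3\right),$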

which exists in |z|] < | pli/ 2 and |w| < oo, with p the (yet unknown) radius of convergence 
of A. Like before, the Pélya terms A(z’), A(z?) are treated as known functions. By methods 
similar to those earlier in the analysis of binary and general trees, we find that the characteristic 
system admits a solution, 


r = 0.3551817423143773928, = s = 2.1174207009536310225, 


so that p = r and y(p) = s. Thus the growth of the number of alcohols is of the form 
xp—"n—3/2, with p—! = 2.81546. 
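
The functional equation for $A(z)$ determines the coefficients one at a time, since the factor $z$ means that $[z^n]A(z)$ only involves earlier coefficients. The following is a minimal sketch of that extraction; the function name `alcohol_counts` is an illustrative choice, not something taken from the text.

```python
from fractions import Fraction


def alcohol_counts(n_max):
    """Coefficients a[0..n_max] of A(z) = 1 + z*(A(z^3)/3 + A(z)A(z^2)/2 + A(z)^3/6)."""
    a = [Fraction(0)] * (n_max + 1)
    a[0] = Fraction(1)
    for n in range(1, n_max + 1):
        k = n - 1  # [z^n] A(z) equals [z^k] of the bracket, which uses a[0..k] only
        # [z^k] A(z^3)
        t3 = a[k // 3] if k % 3 == 0 else Fraction(0)
        # [z^k] A(z) * A(z^2)
        t12 = sum(a[k - 2 * j] * a[j] for j in range(k // 2 + 1))
        # [z^k] A(z)^3
        t111 = sum(a[i] * a[j] * a[k - i - j]
                   for i in range(k + 1) for j in range(k - i + 1))
        a[n] = t3 / 3 + t12 / 2 + t111 / 6
    return [int(x) for x in a]


print(alcohol_counts(8))
```

Exact rational arithmetic is used so that the divisions by 2, 3 and 6 never introduce rounding; the results can be compared with the initial terms of the OGF quoted in the text.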
Let $B(z)$ be the OGF of alkanes (EIS A000602), which are unrooted trees: 

$B(z) = 1 + z + z^2 + z^3 + 2z^4 + 3z^5 + 5z^6 + 9z^7 + 18z^8 + 35z^9 + 75z^{10} + \cdots$. 

For instance, $B_6 = 5$ because there are five isomers of hexane, $C_6H_{14}$, for which chemists had 
to develop a nomenclature system, interestingly enough based on a diameter of the tree: 

Hexane:              CH3-CH2-CH2-CH2-CH2-CH3 
3-Methylpentane:     CH3-CH2-CH(CH3)-CH2-CH3 
2-Methylpentane:     CH3-CH(CH3)-CH2-CH2-CH3 
2,3-Dimethylbutane:  CH3-CH(CH3)-CH(CH3)-CH3 
2,2-Dimethylbutane:  CH3-C(CH3)2-CH2-CH3 
The number of structurally different alkanes can then be found by an adaptation of the 
dissimilarity formula (Equation (57) below and Note VII.26). This problem has served as a 
powerful motivation for the enumeration of graphical trees and its fascinating history goes back 
to Cayley. (See Rains and Sloane’s article [502] and [491].) The asymptotic formula of (unrooted) 
alkanes is of the global form $\lambda\,\rho^{-n} n^{-5/2}$, which represents roughly a proportion $1/n$ of 
the number of (rooted) alcohols: see below. □ 


The pattern of analysis should by now be clear, and we state: 


Theorem VII.4 (Non-plane unlabelled trees). Let $Q \ni 0$ be a finite subset of $\mathbb{Z}_{\ge 0}$ 
and consider the variety $\mathcal{V}$ of (rooted) unlabelled non-plane trees with outdegrees of 
nodes in Q. Assume aperiodicity (gcd(Q) = 1) and the condition that Q contains at 
least one element larger than 1. Then the number of trees of size n in $\mathcal{V}$ satisfies an 
asymptotic formula: 

$V_n \sim C \cdot A^n\, n^{-3/2}$. 

Proof. The argument given for alcohols is transposed verbatim. Only the existence of 
a root of the characteristic system needs to be established. 

The radius of convergence of $V(z)$ is a priori $\le 1$. The fact that $\rho$ is strictly less 
than 1 is established by means of an exponential lower bound; namely, $V_n > B^n$, for 
some $B > 1$ and infinitely many values of n. To obtain this “exponential diversity” of 
the set of trees, first choose an $n_0$ such that $V_{n_0} > 1$, then build a perfect d-ary tree 
(for some $d \in Q$, $d \ne 0, 1$) of height h, and finally graft freely subtrees of size $n_0$ at 
$n/(4n_0)$ of the leaves of the perfect tree. Choosing d such that $d^h > n/(4n_0)$ yields 
the lower bound. That the radius of convergence is non-zero results from the upper 
bound provided by corresponding plane trees whose growth is at most exponential. 
Thus, one has $0 < \rho < 1$. 

By the translation of multisets of bounded cardinality, the function G is polynomial 
in finitely many of the quantities $\{V(z), V(z^2), \ldots\}$. Thus the function $G(z, w)$ 
constructed as in the case of alcohols, in Equation (54), converges in $|z| < \rho^{1/2}$, $|w| < \infty$. 
As $z \to \rho^{-}$, we must have $\tau := V(\rho)$ finite, since otherwise, there would be a 
contradiction in orders of growth in the nonlinear equation $V(z) = \cdots + \cdots V(z)^d \cdots$ 
as $z \to \rho$. Thus $(\rho, \tau)$ satisfies $\tau = G(\rho, \tau)$. For the derivative, one must have 
$G_w(\rho, \tau) = 1$ since: (i) a smaller value would mean that V is analytic at $\rho$ (by the 
Implicit Function Theorem); (ii) a larger value would mean that a singularity has 
been encountered earlier (by the usual argument on failure of the Implicit Function 
Theorem). Thus, Theorem VII.3 on positive implicit functions is applicable. ∎ 


A large number of variations are clearly possible as evidenced by the sugges- 
tive title of an article [320] published by Harary, Robinson, and Schwenk in 1975: 
“Twenty-step algorithm for determining the asymptotic number of trees of various 
species”. 
▷ VII.23. Unlabelled hierarchies. The class $\mathcal{H}$ of unlabelled hierarchies is specified by 
$\mathcal{H} = \mathcal{Z} + \mathrm{MSet}_{\ge 2}(\mathcal{H})$; see Note I.45, p. 72. One has 

$H_n \sim \frac{\gamma}{2\sqrt{\pi n^3}}\,\rho^{-n}, \qquad \rho \doteq 0.29224$. 

(Compare with the labelled case of Example VII.12, p. 472.) What is the asymptotic proportion 
of internal nodes of degree r, for a fixed r > 0? ◁ 
▷ VII.24. Trees with prime degrees and the BBY theory. Bell, Burris, and Yeats [33] develop 
a general theory meant to account for the fact that, in their words, “almost any family of trees 
defined by a recursive equation that is nonlinear [...] lead[s] to an asymptotic law of the Pólya 
form $t(n) \sim C \rho^{-n} n^{-3/2}$.” Their most general result [33, Th. 75] implies for instance that the 
number of unlabelled non-plane trees whose node degrees are restricted to be prime numbers 
admits such a Pólya form (see also Note VII.6, p. 455). ◁ 


Unlabelled functional graphs (mapping patterns). Unlabelled functional graphs 
(named “functions” in [319, pp. 69-70]) are denoted here by F; they correspond to 
unlabelled digraphs with loops allowed, in which each vertex has outdegree equal to 1. 
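They can be specified as multisets of components ($\mathcal{L}$) that are cycles of non-plane 
unlabelled trees ($\mathcal{H}$), 

$\mathcal{F} = \mathrm{MSet}(\mathcal{L}); \qquad \mathcal{L} = \mathrm{Cyc}(\mathcal{H}); \qquad \mathcal{H} = \mathcal{Z} \times \mathrm{MSet}(\mathcal{H})$, 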


a specification that entirely parallels that of mappings in Equation (35), p. 462. Indeed, 
an unlabelled functional graph can be used to represent the “shape” of a mapping, as 
obtained when labels are discarded. That is, functional graphs result when mappings 
are identified up to a possible permutation of their underlying domain. This explains 
the alternative term of “mapping pattern” [436] sometimes employed for such graphs. 
The counting sequence starts as 1, 1, 3, 7, 19, 47, 130, 343, 951 (EIS A001372). 

The OGF $H(z)$ has a square-root singularity by virtue of (52) above, with additionally 
$H(\rho) = 1$. The translation of the unlabelled cycle construction, 

$L(z) = \sum_{j \ge 1} \frac{\varphi(j)}{j} \log \frac{1}{1 - H(z^j)}$, 

implies that $L(z)$ is logarithmic, and $F(z)$ has a singularity of type $1/\sqrt{Z}$ where $Z := 1 - z/\rho$. 
Thus, unlabelled functional graphs constitute an exp-log structure in the 
sense of Section VII. 2, p. 445, with $\kappa = 1/2$. The number of unlabelled functional 
graphs thus grows like $C \rho^{-n} n^{-1/2}$ and the mean number of components in a random 
functional graph is $\sim \frac12 \log n$, as for labelled mappings; see [436] for more on this 
topic. 

▷ VII.25. An alternative form of F(z). Arithmetical simplifications associated with the Euler 
totient function (APPENDIX A, p. 721) yield: 

$F(z) = \prod_{k=1}^{\infty} \left(1 - H(z^k)\right)^{-1}$. 

A similar form applies generally to multisets of unlabelled cycles (Note I.57, p. 85). ◁ 
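
The product form of Note VII.25 lends itself to a direct numerical check of the counting sequence of unlabelled functional graphs. The sketch below assumes the first values $H_1, \ldots, H_8$ of rooted unlabelled trees (EIS A000081) as given input rather than computing them; the function name `functional_graph_counts` is illustrative.

```python
# First values H_1..H_8 of rooted unlabelled non-plane trees (EIS A000081),
# taken here as known input; index 0 is unused and set to 0.
H = [0, 1, 1, 2, 4, 9, 20, 48, 115]


def functional_graph_counts(h):
    """Coefficients of F(z) = prod_{k>=1} 1/(1 - H(z^k)), truncated at z^(len(h)-1)."""
    n_max = len(h) - 1
    # g(z) = 1/(1 - H(z)) satisfies g = 1 + H*g
    g = [0] * (n_max + 1)
    g[0] = 1
    for n in range(1, n_max + 1):
        g[n] = sum(h[j] * g[n - j] for j in range(1, n + 1))
    # multiply the factors g(z^k) for k = 1, 2, ..., n_max
    f = [0] * (n_max + 1)
    f[0] = 1
    for k in range(1, n_max + 1):
        new = [0] * (n_max + 1)
        for i, fi in enumerate(f):
            if fi == 0:
                continue
            for j in range((n_max - i) // k + 1):
                new[i + j * k] += fi * g[j]
        f = new
    return f


print(functional_graph_counts(H))
```

Factors with $k > n$ do not affect the coefficient of $z^n$, which is why the product can be cut off at $k = n_{\max}$.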

Unrooted trees. All the trees considered so far have been rooted and this version 
is the one most useful in applications. An unrooted tree⁹ is by definition a connected 
acyclic (undirected) graph. In that case, the tree is clearly non-plane and no special 
root node is distinguished. 

The counting of the class $\mathcal{U}$ of unrooted labelled trees is easy: there are plainly 
$U_n = n^{n-2}$ of these, since each node is distinguished by its label, which entails that 
$n U_n = T_n$, with $T_n = n^{n-1}$ by Cayley’s formula. Also, the EGF $U(z)$ satisfies 

(55)    $U(z) = \int_0^z T(w)\,\frac{dw}{w} = T(z) - \frac12 T(z)^2$, 

as already seen when we discussed labelled graphs in Subsection II. 5.3, p. 132. 

⁹ Unrooted trees are also sometimes called free trees. 

For unrooted unlabelled trees, symmetries are present and a tree can be rooted 
in a number of ways that depends on its shape. For instance, a star graph leads to a 
number of different rooted trees that equals 2 (choose either the centre or one of the 
peripheral nodes), while a line graph gives rise to $\lceil n/2 \rceil$ structurally different rooted 
trees. With $\mathcal{H}$ the class of rooted unlabelled trees and $\mathcal{I}$ the class of unrooted trees, 
we have at this stage only a general inequality of the form 

$I_n \le H_n \le n I_n$. 


A table of values of the ratio $H_n/I_n$ suggests that the answer is close to the upper 
bound: 

(56) 
    n        |   10     20     30     40     50     60 
    H_n/I_n  |  6.78  15.58  23.89  32.15  40.39  48.62 


The solution is provided by a famous exact formula due to Otter (Note VII.26): 

(57)    $I(z) = H(z) - \frac12\left(H(z)^2 - H(z^2)\right)$, 

which gives in particular (EIS A000055) $I(z) = z + z^2 + z^3 + 2z^4 + 3z^5 + 6z^6 + 11z^7 + 23z^8 + \cdots$. 
Given (57), it is child’s play to determine the singular expansion 
of I knowing that of H. The radius of convergence of I is the same as that of H, since 
the term $H(z^2)$ only introduces exponentially small coefficients. Thus, it suffices to 
analyse $H - \frac12 H^2$: 

$H(z) - \frac12 H(z)^2 \sim \frac12 + \delta_2 Z + \delta_3 Z^{3/2} + O(Z^2), \qquad Z = \left(1 - \frac{z}{\rho}\right)$. 


What is noticeable is the cancellation in coefficients for the term $Z^{1/2}$ 
(since $1 - x - \frac12(1 - x)^2 = \frac12 + O(x^2)$), so that $Z^{3/2}$ is the actual singularity type of I. Clearly, 
the constant $\delta_3$ is computable from the first four terms in the singular expansion of H 
at $\rho$. Then singularity analysis yields: The number of unrooted trees of size n satisfies 
the formula 

(58)    $I_n \sim \frac{3\delta_3}{4\sqrt{\pi}}\,\rho^{-n} n^{-5/2}, \qquad I_n \sim (0.5349496061\ldots)\,(2.9955765856\ldots)^n\, n^{-5/2}$. 

The numerical values are from [211] and the result is Otter’s original [466]: an unrooted 
tree of size n gives rise to about 0.8n different rooted trees on average. (The 
formula (58) corresponds to an error slightly under $10^{-2}$ for n = 100.) 
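
Otter's formula (57) translates directly into a coefficient computation once the numbers $H_n$ of rooted trees are available. The sketch below obtains $H_n$ from the classical recurrence $n H_{n+1} = \sum_{k=1}^{n} \bigl(\sum_{d \mid k} d H_d\bigr) H_{n-k+1}$, which is a standard consequence of $H(z) = z \exp\bigl(\sum_k H(z^k)/k\bigr)$ and is an assumption brought in here, not a formula from this section. Function names are illustrative.

```python
def rooted_tree_counts(n_max):
    """h[n] = number of rooted unlabelled non-plane trees with n nodes (h[0] = 0)."""
    h = [0] * (n_max + 1)
    if n_max >= 1:
        h[1] = 1
    for n in range(1, n_max):
        total = 0
        for k in range(1, n + 1):
            s = sum(d * h[d] for d in range(1, k + 1) if k % d == 0)
            total += s * h[n - k + 1]
        h[n + 1] = total // n
    return h


def unrooted_tree_counts(n_max):
    """Return (h, i) with i[n] = [z^n] of H(z) - (H(z)^2 - H(z^2))/2, as in (57)."""
    h = rooted_tree_counts(n_max)
    i = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        square = sum(h[j] * h[n - j] for j in range(1, n))  # [z^n] H(z)^2
        diag = h[n // 2] if n % 2 == 0 else 0               # [z^n] H(z^2)
        i[n] = h[n] - (square - diag) // 2
    return h, i


h, i = unrooted_tree_counts(60)
print(i[1:9])
for n in (10, 20, 30, 40, 50, 60):
    print(n, round(h[n] / i[n], 2))
```

The printed ratios can be compared with table (56), and the initial values of `i` with the expansion of $I(z)$ given after (57).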


▷ VII.26. Dissimilarity theorem for trees. Here is how combinatorics justifies (57), following 
[50, §4.1]. Let $\mathcal{I}^{\bullet}$ (and $\mathcal{I}^{\bullet-\bullet}$) be the class of unrooted trees with one vertex (respectively, one 
edge) distinguished. We have $\mathcal{I}^{\bullet} \cong \mathcal{H}$ (rooted trees) and $\mathcal{I}^{\bullet-\bullet} \cong \mathrm{MSet}_2(\mathcal{H})$. The combinatorial 
isomorphism claimed is 

(59)    $\mathcal{I}^{\bullet} + \mathcal{I}^{\bullet-\bullet} \cong \mathcal{I} + (\mathcal{I}^{\bullet} \times \mathcal{I}^{\bullet})$. 

Proof. A diameter of an unrooted tree is a simple path of maximal length. If the length of 
any diameter is even, call “centre” its mid-point; otherwise, call “bicentre” its mid-edge. (For 
each tree, there is either one centre or one bicentre.) The left-hand side of (59) corresponds to 
trees that are pointed either at a vertex ($\mathcal{I}^{\bullet}$) or an edge ($\mathcal{I}^{\bullet-\bullet}$). The term $\mathcal{I}$ on the right-hand 
side corresponds to cases where the pointing happens to coincide with the canonical centre or 
bicentre. If there is no coincidence, then an ordered pair of trees results from a suitable surgery 
of the pointed tree. [Hint: cut in some canonical way near the pointed vertex or edge.] ◁ 


VII. 6. Irreducible context-free structures 


In this section, we discuss an important variety of context-free classes, one that 
gives rise to the universal law of square-root singularities, itself attached to counting 
sequences that are of the general asymptotic form $A^n n^{-3/2}$. First, we enunciate 
an abstract structural result (Theorem VII.5, p. 483) that connects “irreducibility” of 
context-free systems to the square-root singularity phenomenon. Before engaging in 
a proof, we first illustrate its scope by describing applications to non-crossing configurations 
in the plane (these are richer than triangulations introduced in Chapter I) and to 
random boolean expressions. Finally, we prove an important complex analytic result, 
the Drmota–Lalley–Woods Theorem (Theorem VII.6, p. 489), which provides the underlying 
analytic engine needed to establish Theorem VII.5 and justify the asymptotic 
properties of irreducible context-free specifications. General algebraic functions are 
to be treated next, in Section VII. 7, p. 493. 


VII. 6.1. Context-free specifications and the irreducibility schema. We start 
from the notion of a context-free class already introduced in Subsection I. 5.4, p. 79, 
which we recall: a class is context-free if it is determined as the first component of a 
system of combinatorial equations 

(60)    $\mathcal{Y}_1 = \mathfrak{F}_1(\mathcal{Z}, \mathcal{Y}_1, \ldots, \mathcal{Y}_r)$ 
        $\vdots$ 
        $\mathcal{Y}_r = \mathfrak{F}_r(\mathcal{Z}, \mathcal{Y}_1, \ldots, \mathcal{Y}_r)$, 

where each $\mathfrak{F}_j$ is a construction that only involves the combinatorial constructions of 
disjoint union and cartesian product. (This repeats Equation (83) of Chapter I, p. 79.) 
As seen in Subsection I. 5.4, binary and general trees, triangulations, as well as Dyck 
and Łukasiewicz languages are typical instances of context-free classes. 

As a consequence of the symbolic rules of Chapter I, the OGF of a context-free 
class $\mathcal{C}$ is the first component ($C(z) \equiv y_1(z)$) of the solution of a polynomial system 
of equations of the form 

(61)    $y_1(z) = \Phi_1(z, y_1(z), \ldots, y_r(z))$ 
        $\vdots$ 
        $y_r(z) = \Phi_r(z, y_1(z), \ldots, y_r(z))$, 

where the $\Phi_j$ are polynomials. By elimination (cf. Appendix B.1: Algebraic elimination, 
p. 739), it is always possible to find a bivariate polynomial $P(z, y)$ such that 

(62)    $P(z, C(z)) = 0$, 

and $C(z)$ is an algebraic function. (Algebraic functions are discussed in all generality 
in the next section.) 
The case of linear systems has been dealt with in Chapter V, when examining 
the transfer matrix method. Accordingly, we only need to consider here nonlinear 
systems (of equations or specifications) defined by the condition that at least one $\Phi_j$ 
in (61) is a polynomial of degree 2 or more in the $y_j$, corresponding to the fact that at 
least one of the constructions $\mathfrak{F}_j$ in (60) involves at least a product $\mathcal{Y}_k \mathcal{Y}_\ell$. 


Definition VII.5. A context-free specification (60) is said to belong to the irreducible 
context-free schema if it is nonlinear and its dependency graph (p. 33) is strongly 
connected. It is said to be aperiodic if all the $y_j(z)$ are aperiodic¹⁰. 


Theorem VII.5 (Irreducible context-free schema). A class $\mathcal{C}$ that belongs to the irreducible 
context-free schema has a generating function that has a square-root singularity 
at its radius of convergence $\rho$: 

$C(z) = \tau - \gamma \sqrt{1 - \frac{z}{\rho}} + O\left(1 - \frac{z}{\rho}\right)$, 

for computable algebraic numbers $\rho, \tau, \gamma$. If, in addition, $C(z)$ is aperiodic, then the 
dominant singularity is unique and the counting sequence satisfies 

(63)    $C_n \sim \frac{\gamma}{2\sqrt{\pi n^3}}\,\rho^{-n}$. 

This theorem is none other than a transcription, at the combinatorial level, of a 
remarkable analytic statement, Theorem VII.6, due to Drmota, Lalley, and Woods, 
which is proved below (p. 489), is slightly stronger, and is of independent interest. 


Computability issues. There are two complementary approaches to the calcula- 
tion of the quantities that appear in (63), one based on the original system (61), the 
other based on the single equation (62) that results from elimination. We offer at this 
stage a brief pragmatic discussion of computational aspects, referring the reader to 
Subsection VII. 6.3, p. 488, and Section VII. 7, p. 493, for context and justifications. 


(a) System: Considering the proof of Theorem VII.6 below, one should solve, in 
positive real numbers, a polynomial system of m + 1 equations in the m + 1 unknowns 
$\rho, \tau_1, \ldots, \tau_m$; namely, 

(64)    $\tau_1 = \Phi_1(\rho, \tau_1, \ldots, \tau_m)$ 
        $\vdots$ 
        $\tau_m = \Phi_m(\rho, \tau_1, \ldots, \tau_m)$ 
        $0 = J(\rho, \tau_1, \ldots, \tau_m)$, 

which one can call the characteristic system. There J is the Jacobian determinant: 

(65)    $J(z, y_1, \ldots, y_m) := \det\left(\delta_{i,j} - \frac{\partial}{\partial y_j} \Phi_i(z, y_1, \ldots, y_m)\right)$, 

with $\delta_{i,j} \equiv [[i = j]]$ being the usual Kronecker symbol. The quantity $\rho$ represents the 
common radius of convergence of all the $y_j(z)$ and $\tau_j = y_j(\rho)$. (In case several possibilities 
present themselves for $\rho$, as in Note VII.28, then one can either use a priori 
combinatorial bounds to filter out the spurious ones¹¹ or make use of the reduction to a 
single equation as in point (b) below.) The constant $\gamma \equiv \gamma_1$ in Theorem VII.5 is then 
a component of the solution to a linear system of equations (with coefficients in the 
field generated by $\rho$, $\tau_j$) and is obtained by the method of undetermined coefficients, 
since each $y_j$ is of the form 

(66)    $y_j(z) = \tau_j - \gamma_j \sqrt{1 - z/\rho} + O(1 - z/\rho)$. 

¹⁰ An aperiodic function is such that the span of the coefficient sequence is equal to 1 (Definition IV.5, 
p. 266). For an irreducible system, it can be checked that all the $y_j$ are aperiodic if and only if at least one 
of the $y_j$ is aperiodic. 


(b) Equation: The general techniques are going to be described in Section VII. 7, 
p. 493. They give rise to the following algorithm: (i) determine the exceptional set, 
identify the proper branch of the algebraic curve and the dominant positive singularity; 
(ii) determine the coefficients in the singular (Puiseux) expansion, knowing a priori 
that the singularity is of the square-root type. 


In all events, symbolic algebra systems prove invaluable in performing the re- 
quired algebraic eliminations and isolating the combinatorially relevant roots (see, in 
particular, Pivoteau et al. [485] for a general symbolic-numeric approach). Example VII.16 
serves to illustrate some of these computations. 


▷ VII.27. Catalan and the Jacobian determinant. For the Catalan GF, defined by $y = 1 + zy^2$, 
the characteristic system (64) instantiates to 

$\tau - 1 - \rho\tau^2 = 0, \qquad 1 - 2\rho\tau = 0$, 

giving back as expected: $\rho = \frac14$, $\tau = 2$. ◁ 
▷ VII.28. Burris’ Caveat. As noted by Stanley Burris (private communication), even some 
very simple context-free specifications may be such that there exist several positive solutions to 
the characteristic system (64). Consider 

(B):    $y_1 = z(1 + y_2 + y_1^2)$ 
        $y_2 = z(1 + y_1 + y_2^2)$, 

which is clearly associated to a redundant way of counting unary-binary trees (via a deterministic 
2-colouring). The characteristic system is 

$\{\tau_1 = \rho(1 + \tau_2 + \tau_1^2),\ \ \tau_2 = \rho(1 + \tau_1 + \tau_2^2),\ \ (1 - 2\rho\tau_1)(1 - 2\rho\tau_2) - \rho^2 = 0\}$. 

The positive solutions are 

$\{\rho = \tfrac13,\ \tau_1 = \tau_2 = 1\} \ \cup\ \{\rho = \tfrac17(2\sqrt2 - 1),\ \tau_1 = \tau_2 = \sqrt2 + 1\}$. 

Only the first solution is combinatorially significant. (A somewhat similar situation, though it 
relates to a non-irreducible context-free specification, arises with supertrees of Example VII.20, 
p. 503: see Figure VII.19, p. 504.) ◁ 
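
Both candidate solutions in Burris' example can be checked numerically, and the spurious one can be filtered out by a simple convergence test. The sketch below is one possible filter, chosen for illustration and not the method prescribed by the text: it iterates the system (B) from $(0, 0)$ at a real point $z$; where the power series converges, this monotone iteration converges to $(y_1(z), y_2(z))$. The helper names `residuals` and `value_by_iteration` are hypothetical.

```python
import math


def residuals(rho, t1, t2):
    """Residuals of the three equations of the characteristic system of (B)."""
    return (
        t1 - rho * (1 + t2 + t1 * t1),
        t2 - rho * (1 + t1 + t2 * t2),
        (1 - 2 * rho * t1) * (1 - 2 * rho * t2) - rho * rho,
    )


def value_by_iteration(z, steps=2000):
    """Iterate (y1, y2) <- (z(1 + y2 + y1^2), z(1 + y1 + y2^2)) starting from (0, 0)."""
    y1 = y2 = 0.0
    for _ in range(steps):
        y1, y2 = z * (1 + y2 + y1 * y1), z * (1 + y1 + y2 * y2)
    return y1, y2


sol_a = (1 / 3, 1.0, 1.0)
sol_b = ((2 * math.sqrt(2) - 1) / 7, math.sqrt(2) + 1, math.sqrt(2) + 1)

print(residuals(*sol_a))
print(residuals(*sol_b))
# At the second candidate radius, the series value is about 0.414, not 2.414:
print(value_by_iteration(sol_b[0]))
# The iteration still converges at z = 0.3, which lies beyond the second candidate radius:
print(value_by_iteration(0.3))
```

Both triples satisfy the characteristic system, but the series for $y_1$ is finite and well behaved at $z = 0.3 > \tfrac17(2\sqrt2 - 1)$, so the second candidate cannot be the radius of convergence; it lies on another branch of the algebraic curve.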


¹¹ This is once more a connection problem, in the sense of p. 470. 
VII. 6.2. Combinatorial applications. Lattice animals (Example I.18, p. 80), 
random walks on free groups [395], directed walks in the plane (see references [27, 
392, 395] and p. 506 below), coloured trees [616], and boolean expression trees (reference 
[115] and Example VII.17) are only some of the many combinatorial structures 
belonging to the irreducible context-free schema. Stanley presents in his book [554, 
Ch. 6] several examples of algebraic GFs, and an inspiring survey is provided by 
Bousquet-Mélou in [84]. We limit ourselves here to a brief discussion of non-crossing 
configurations and random boolean expressions. 


Example VII.16. Non-crossing configurations. Context-free descriptions can model naturally 
very diverse sorts of objects including particular topological-geometric configurations; we examine 
here non-crossing planar configurations. The problems considered have their origin in 
combinatorial musings of the Rev. T.P. Kirkman in 1857 and were revisited in 1974 by Domb 
and Barett [169] for the purpose of investigating certain perturbative expansions of statistical 
physics. Our presentation follows closely the synthesis offered by Flajolet and Noy in [245]. 

Consider, for each value of n, graphs built on vertices that are all the nth complex roots 
of unity, numbered 0, ..., n − 1. A non-crossing graph is a graph such that no two of its 
edges cross. One can also define connected non-crossing graphs, non-crossing forests (acyclic 
graphs), and non-crossing trees (acyclic connected graphs); see Figure VII.14. Note that the 
various graphs considered can always be considered as rooted in some canonical way (e.g., at 
the vertex of smallest index). 


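Trees. A non-crossing tree is rooted at 0. To the root vertex is attached an ordered collection 
of vertices, each of which has an end-node v that is the common root of two non-crossing 
trees, one on the left of the edge (0, v), the other on the right of (0, v). Let $\mathcal{T}$ denote the class 
of trees and $\mathcal{U}$ denote the class of trees whose root has been severed. With $\bullet \equiv \mathcal{Z}$ denoting a 
generic node, we have 

$\mathcal{T} = \bullet \times \mathcal{U}, \qquad \mathcal{U} = \mathrm{Seq}(\mathcal{U} \times \bullet \times \mathcal{U})$, 

which corresponds graphically to the “butterfly decomposition”: 

[Diagram: butterfly decomposition of $\mathcal{T}$ and $\mathcal{U}$.] 

The reduction to a pure context-free form is obtained by noticing that $\mathcal{U} = \mathrm{Seq}(\mathcal{V})$ is 
equivalent to $\mathcal{U} = 1 + \mathcal{U}\mathcal{V}$: a specification and the associated polynomial system are then 

(67)    $\{\mathcal{T} = \mathcal{Z}\mathcal{U},\ \mathcal{U} = 1 + \mathcal{U}\mathcal{V},\ \mathcal{V} = \mathcal{Z}\mathcal{U}\mathcal{U}\} \implies \{T = zU,\ U = 1 + UV,\ V = zU^2\}$. 

This system relating U and V is irreducible (then, T is immediately obtained from U), and 
aperiodicity is obvious from the first few values of the coefficients. The Jacobian (65) of the 
{U, V}-system (obtained by $z \to \rho$, $U \to \upsilon$, $V \to \beta$), is 

$\begin{vmatrix} 1 - \beta & -\upsilon \\ -2\rho\upsilon & 1 \end{vmatrix} = 1 - \beta - 2\rho\upsilon^2$. 

Thus, the characteristic system (64) giving the singularity of U, V is 

$\{\upsilon = 1 + \upsilon\beta,\ \ \beta = \rho\upsilon^2,\ \ 1 - \beta - 2\rho\upsilon^2 = 0\}$, 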
whose positive solution is $\rho = \frac{4}{27}$, $\upsilon = \frac32$, $\beta = \frac13$. The complete asymptotic formula is 
displayed in Figure VII.14. (In a simple case like this, we have more: T satisfies $T^3 - zT + z^2 = 0$, 
which, by Lagrange inversion, gives $T_n = \frac{1}{2n-1}\binom{3n-3}{n-1}$.) 

Figure VII.14. (Top) Non-crossing graphs: a tree, a forest, a connected graph, and a 
general graph. (Bottom) The enumeration of non-crossing configurations by algebraic 
functions. 
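
The polynomial system for non-crossing trees, $\{T = zU,\ U = 1 + UV,\ V = zU^2\}$, can be solved for its first coefficients by plain iteration on truncated power series, and the result compared with the binomial expression for $T_n$. The sketch below does this and also checks, in exact arithmetic, that $(\rho, \upsilon, \beta) = (4/27, 3/2, 1/3)$ satisfies the characteristic system. The helper names are illustrative, and the number of iterations ($2 n_{\max} + 2$) is a deliberately generous choice.

```python
from fractions import Fraction
from math import comb


def mul(a, b, n_max):
    """Product of two power series (lists of coefficients), truncated at z^n_max."""
    c = [0] * (n_max + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b[: n_max + 1 - i]):
                c[i + j] += ai * bj
    return c


def noncrossing_tree_counts(n_max):
    """Iterate U = 1 + U V, V = z U^2 on truncated series, then return T = z U."""
    U = [0] * (n_max + 1)
    V = [0] * (n_max + 1)
    for _ in range(2 * n_max + 2):
        UV = mul(U, V, n_max)
        UU = mul(U, U, n_max)
        U = [1 + UV[0]] + UV[1:]
        V = [0] + UU[:n_max]          # multiplication by z shifts coefficients
    return [0] + U[:n_max]


def binomial_form(n):
    return comb(3 * n - 3, n - 1) // (2 * n - 1)


print(noncrossing_tree_counts(6)[1:])
print([binomial_form(n) for n in range(1, 7)])

# Exact check of the characteristic system at (rho, upsilon, beta).
rho, ups, beta = Fraction(4, 27), Fraction(3, 2), Fraction(1, 3)
print(ups == 1 + ups * beta, beta == rho * ups**2, 1 - beta - 2 * rho * ups**2 == 0)
```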


Forests. A (non-crossing) forest is a non-crossing graph that is acyclic. In the present context, 
it is not possible to express forests simply as sequences of trees, because of the geometry of 
the problem. Starting conventionally from the root vertex 0 and following all connected edges 
defines a “backbone” tree. To the left of every vertex of the tree, a forest may be placed. There 
results the decomposition (expressed directly in terms of OGFs) 

(68)    $F = 1 + T[z \mapsto zF]$, 

where T is the OGF of trees and F is the OGF of forests. In (68), the term $T[z \mapsto zF]$ denotes 
a functional composition. A context-free specification in standard form results mechanically 
from (67) upon replacing z by zF: 

(69)    $\{F = 1 + T,\ \ T = zFU,\ \ U = 1 + UV,\ \ V = zFU^2\}$. 

This system is irreducible and aperiodic, so that the asymptotic shape of $F_n$ is a priori of the 
form $\gamma\,\omega^n n^{-3/2}$ according to Theorem VII.5. The characteristic system is found to have three 
solutions, of which only one has all its components positive, corresponding to $\rho \doteq 0.12158$, a 
root of the cubic equation $5\rho^3 - 8\rho^2 - 32\rho + 4 = 0$. (The values of constants are otherwise 
worked out in Example VII.19, p. 502, by means of the equational approach.) 
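
The numerical value of $\rho$ for forests can be recovered from the cubic $5\rho^3 - 8\rho^2 - 32\rho + 4 = 0$ alone. The sketch below uses bisection on $[0, 1]$, where the cubic is strictly decreasing and changes sign; the choice of bisection and of the bracketing interval are assumptions made for illustration.

```python
def cubic(x):
    return 5 * x**3 - 8 * x**2 - 32 * x + 4


def bisect_root(f, lo, hi, tol=1e-12):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must have opposite signs."""
    if f(lo) * f(hi) >= 0:
        raise ValueError("the interval does not bracket a sign change")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


rho = bisect_root(cubic, 0.0, 1.0)
print(rho, 1 / rho)  # radius of convergence and exponential growth rate of F_n
```

The reciprocal $1/\rho$ is the growth rate $\omega$ in the estimate $F_n \sim \gamma\,\omega^n n^{-3/2}$.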


Graphs. Similar constructions (see [245]) give the OGFs of connected and general graphs, 
with the results tabulated in Figure VII.14. In summary: 

Proposition VII.6. The number of non-crossing trees, forests, connected graphs, and graphs 
each satisfy an asymptotic formula of the form 

$\frac{C}{\sqrt{\pi n^3}}\,A^n$. 

The common shape of the asymptotic estimates is worthy of note, as is the fact that binomial 
expressions are available in each particular case (Note VII.34, p. 495, introduces a general 
framework that “explains” the existence of such binomial expressions). □ 


Example VII.17. Random boolean expressions. We reconsider boolean expressions in 
the form of and-or trees introduced in Example I.15, p. 69, in connection with Hipparchus of 
Rhodes and Schröder, and in Example I.17, p. 77. Such an expression is described by a binary 
tree whose internal nodes can be tagged with “∨” (or-function) or “∧” (and-function); external 
nodes are formal variables and their negations (“literals”). We fix the number of variables to 
some number m. The class $\mathcal{E}$ of all such boolean expressions satisfies a symbolic equation of 
the form 

$\mathcal{E} = \vee(\mathcal{E}, \mathcal{E}) + \wedge(\mathcal{E}, \mathcal{E}) + \sum_{j=1}^{m} (x_j + \neg x_j)$, 

where $\vee(\cdot, \cdot)$ and $\wedge(\cdot, \cdot)$ denote a root node with the indicated tag and two subtrees. 
Size is taken to be the number of internal (binary) nodes; that is, the number of boolean connectives. 
Each boolean expression given in the form of such an and-or tree represents a certain 
boolean function of m variables, among the $2^{2^m}$ functions. The corresponding OGF and coefficients are 

$E(z) = \frac{1 - \sqrt{1 - 16mz}}{4z}, \qquad E_n = [z^n]E(z) = 2^n (2m)^{n+1} \frac{1}{n+1}\binom{2n}{n} \sim \frac{2m\,(16m)^n}{\sqrt{\pi n^3}}$, 

the radius of convergence of $E(z)$ being $\rho = 1/(16m)$. 
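
The closed form for $E_n$ can be cross-checked against the recurrence implied by the symbolic equation, which at the level of OGFs reads $E(z) = 2m + 2z\,E(z)^2$ (two possible tags for the root, $2m$ literals of size 0). A minimal sketch, with illustrative function names:

```python
from math import comb


def and_or_tree_counts(m, n_max):
    """E_0..E_n_max from E(z) = 2m + 2 z E(z)^2 (size = number of connectives)."""
    e = [2 * m]
    for n in range(1, n_max + 1):
        e.append(2 * sum(e[k] * e[n - 1 - k] for k in range(n)))
    return e


def closed_form(m, n):
    """2^n (2m)^(n+1) times the Catalan number binom(2n, n)/(n+1)."""
    return 2**n * (2 * m) ** (n + 1) * comb(2 * n, n) // (n + 1)


for m in (1, 2, 3):
    print(m, and_or_tree_counts(m, 5), [closed_form(m, n) for n in range(6)])
```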
Our purpose is to establish the following result due to Lefmann and Savický [405], our line 
of proof following [115]. 
Proposition VII.7. Let f be a boolean function of m variables (m fixed). Then the probability 
that a random and-or formula of size n computes f converges, as n tends to infinity, to a 
constant value $\varpi(f) \ne 0$. 
Proof. Consider, for each f, the subclass $\mathcal{Y}_f \subset \mathcal{E}$ of expressions that compute f. We thus 
have $2^{2^m}$ such classes. It is then immediate to write combinatorial equations describing the $\mathcal{Y}_f$, 
by considering all the ways in which a function f can arise. Indeed, if f is not a literal, then 

$\mathcal{Y}_f = \sum_{(g \vee h) = f} \vee(\mathcal{Y}_g, \mathcal{Y}_h) + \sum_{(g \wedge h) = f} \wedge(\mathcal{Y}_g, \mathcal{Y}_h)$, 

while, if $f = x_j$ (say), then 

$\mathcal{Y}_f = x_j + \sum_{(g \vee h) = f} \vee(\mathcal{Y}_g, \mathcal{Y}_h) + \sum_{(g \wedge h) = f} \wedge(\mathcal{Y}_g, \mathcal{Y}_h)$. 

Thus, at generating function level, we have a system of $2^{2^m}$ polynomial equations. This system 
is irreducible: given two functions f and g represented by $\Phi$ and $\Gamma$ (say), we can always 
construct an expression for f involving the expression $\Gamma$ by building a tree of the form 

$(\Phi \wedge (\mathrm{True} \vee \Gamma)) \equiv (\Phi \wedge ((x_1 \vee \neg x_1) \vee \Gamma))$. 

Thus any $\mathcal{Y}_f$ depends on any other $\mathcal{Y}_g$. Similar arguments, based on the fact that 

$\mathrm{True} = (\mathrm{True} \wedge \mathrm{True}) = (\mathrm{True} \wedge \mathrm{True} \wedge \mathrm{True}) = \cdots$, 

with “True” itself representable as $(x_1 \vee \neg x_1) = ((x_1 \wedge x_1) \vee \neg x_1) = \cdots$, guarantee aperiodicity. 
Thus Theorem VII.5 applies: the $Y_f(z)$ all have the same radius of convergence, and that 
radius must be equal to that of $E(z)$ (namely $\rho = 1/(16m)$), since $\mathcal{E} = \sum_f \mathcal{Y}_f$. Thereby the 
proposition is established. ∎ 

It is an interesting and largely open problem to characterize the relation between the limit 
probability $\varpi(f)$ of a function f and its structural complexity. At least, the cases m = 1, 2, 3 
can be solved exactly and numerically: it appears that functions of low complexity tend to occur 
much more frequently, as shown by the data of [115]. □ 


VII. 6.3. The analysis of irreducible polynomial systems. The analytic engine 
behind Theorem VII.5 is a fundamental result, the “Drmota–Lalley–Woods” (DLW) 
Theorem, due to independent research by several authors: Drmota [172] developed a 
version of the theorem in the course of studies relative to limit laws in various families 
of trees defined by context-free grammars; Woods [616], motivated by questions of 
boolean complexity and finite model theory, gave a form expressed in terms of colour- 
ing rules for trees; finally, Lalley [395] came across a similarly general result when 
quantifying return probabilities for random walks on groups. Drmota and Lalley show 
how to pull out limit Gaussian laws for simple parameters (by a perturbative analysis; 
see Chapter IX); Woods shows how to deduce estimates of coefficients even in some 
periodic or non-irreducible cases. 

In the treatment that follows we start from a polynomial system of equations, 


$y_j = \Phi_j(z, y_1, \ldots, y_m), \qquad j = 1, \ldots, m$, 
in accordance with the notations adopted at the beginning of the section. We only 
consider nonlinear systems defined by the fact that at least one polynomial $\Phi_j$ is nonlinear 
in some of the indeterminates $y_1, \ldots, y_m$. (Linear systems have been discussed 
extensively in Chapter V.) 
For applications to combinatorics, we define four possible attributes of a polynomial 
system. The first one is a natural positivity condition. 
(i) Algebraic positivity (or a-positivity). A polynomial system is said to be a-positive 
if all the component polynomials $\Phi_j$ have non-negative coefficients. 
Next, we want to restrict consideration to systems that determine a unique solution 
vector $(y_1, \ldots, y_m) \in (\mathbb{C}[[z]])^m$. Define the z-valuation $\mathrm{val}(\mathbf{y})$ of a vector 
$\mathbf{y} \in \mathbb{C}[[z]]^m$ as the minimum over all j’s of the individual valuations¹² $\mathrm{val}(y_j)$. The 
distance between two vectors is defined as usual by $d(\mathbf{u}, \mathbf{v}) = 2^{-\mathrm{val}(\mathbf{u} - \mathbf{v})}$. Then: 


(ii) Algebraic properness (or a-properness). A polynomial system is said to be 
a-proper if it satisfies a Lipschitz condition 

$d(\Phi(\mathbf{y}), \Phi(\mathbf{y}')) < K\,d(\mathbf{y}, \mathbf{y}')$ for some $K < 1$. 

In that case, the transformation $\Phi$ is a contraction on the complete metric space of 
formal power series and, by the general fixed point theorem, the equation $\mathbf{y} = \Phi(\mathbf{y})$ 
admits a unique solution. This solution may be obtained by the iterative scheme, 

$\mathbf{y}^{(0)} = (0, \ldots, 0)^t, \qquad \mathbf{y}^{(h+1)} = \Phi(\mathbf{y}^{(h)}), \qquad \mathbf{y} = \lim_{h \to \infty} \mathbf{y}^{(h)}$, 

in accordance with our discussion of the semantics of recursion, on p. 31. 
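
The contraction property can be observed concretely on a scalar instance. The sketch below takes $\Phi(y) = z(1 + y + y^2)$ (the unary-binary tree equation, chosen here purely as an illustration) and iterates from $y^{(0)} = 0$ on truncated series, recording the valuation of $y^{(h+1)} - y^{(h)}$ at each step. The function names are hypothetical.

```python
def phi(y, n_max):
    """One application of Phi(y) = z(1 + y + y^2) on a series truncated at z^n_max."""
    sq = [0] * (n_max + 1)
    for i, a in enumerate(y):
        for j, b in enumerate(y):
            if i + j <= n_max:
                sq[i + j] += a * b
    inner = [1 + y[0] + sq[0]] + [y[k] + sq[k] for k in range(1, n_max + 1)]
    return [0] + inner[:n_max]  # multiplication by z shifts coefficients


def valuation(series):
    """Index of the first non-zero coefficient, or None for the zero series."""
    for k, c in enumerate(series):
        if c != 0:
            return k
    return None


def iterate(n_max, steps):
    """Return y^(steps) and the list of val(y^(h+1) - y^(h)) for h = 0..steps-1."""
    y = [0] * (n_max + 1)
    gaps = []
    for _ in range(steps):
        y_next = phi(y, n_max)
        gaps.append(valuation([a - b for a, b in zip(y_next, y)]))
        y = y_next
    return y, gaps


y, gaps = iterate(7, 7)
print(y)     # coefficients of the fixed point up to z^7
print(gaps)  # valuations of successive differences
```

Each step increases the valuation of the difference by one, so the distance $2^{-\mathrm{val}}$ between successive iterates is halved and the coefficients stabilize one at a time.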

The key notion is irreducibility. To a polynomial system, $\mathbf{y} = \Phi(\mathbf{y})$, associate its 
dependency graph defined in the usual way as a graph whose vertices are the numbers 
1, ..., m and the edges ending at a vertex j are $k \to j$, if $y_j$ figures in a monomial of 
$\Phi_k$. 

(iii) Algebraic irreducibility (or a-irreducibility). A polynomial system is said to 

be a-irreducible if its dependency graph is strongly connected. 
(This notion matches that of Definition VII.5, p. 483.) 
Finally, one needs the usual technical notion of aperiodicity: 
(iv) Algebraic aperiodicity (or a-aperiodicity). A proper polynomial system is 
said to be aperiodic if each of its component solutions y; is aperiodic in the 
sense of Definition IV.5, p. 266. 
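
Irreducibility in the sense of (iii) is a purely graph-theoretic condition and is easy to test mechanically. The sketch below encodes the dependency graphs of the non-crossing tree system (67) and forest system (69) from Example VII.16 by hand and checks strong connectivity by reachability from every vertex; the dictionary encoding and function names are illustrative choices.

```python
def reachable(graph, start):
    """Set of vertices reachable from start (start included) by depth-first search."""
    seen = {start}
    stack = [start]
    while stack:
        v = stack.pop()
        for w in graph.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def is_strongly_connected(graph):
    nodes = set(graph)
    return all(reachable(graph, v) == nodes for v in nodes)


# Edge k -> j whenever y_j occurs in the right-hand side Phi_k.
system_67 = {"T": ["U"], "U": ["U", "V"], "V": ["U"]}        # T = zU, U = 1 + UV, V = zU^2
system_67_uv = {"U": ["U", "V"], "V": ["U"]}                  # the {U, V} subsystem
system_69 = {"F": ["T"], "T": ["F", "U"], "U": ["U", "V"], "V": ["F", "U"]}

print(is_strongly_connected(system_67))     # T is not reachable from U or V
print(is_strongly_connected(system_67_uv))
print(is_strongly_connected(system_69))
```

This matches the treatment in Example VII.16, where the theorem is applied to the $\{U, V\}$ subsystem and $T$ is then obtained from $U$. Nonlinearity, positivity and properness are separate conditions and are not checked by this sketch.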
We can now state: 
Theorem VII.6 (Irreducible positive polynomial systems, DLW Theorem). Consider 
a nonlinear polynomial system $\mathbf{y} = \Phi(\mathbf{y})$ that is a-positive, a-proper, and a-irreducible. 
Then, all component solutions $y_j$ have the same radius of convergence $\rho < \infty$, and 
there exist functions $h_j$ analytic at the origin such that, in a neighbourhood of $\rho$: 

(70)    $y_j = h_j\left(\sqrt{1 - z/\rho}\right)$. 

In addition, all other dominant singularities are of the form $\rho\omega$ with $\omega$ a root of unity. 
If furthermore the system is a-aperiodic, all $y_j$ have $\rho$ as unique dominant singularity. 
In that case, the coefficients admit a complete asymptotic expansion, 

(71)    $[z^n]\,y_j(z) \sim \rho^{-n}\left(\sum_{k \ge 1} d_k\, n^{-1-k/2}\right)$, 

for computable $d_k$. 

¹² Let $f = \sum_{n=\beta}^{\infty} f_n z^n$ with $f_\beta \ne 0$ and $f_0 = \cdots = f_{\beta-1} = 0$; the valuation of f is by definition 
$\mathrm{val}(f) = \beta$; see Appendix A.5: Formal power series, p. 730. 


Proof. The proof consists in gathering by stages consequences of the assumptions. 
It is essentially based on a close examination of “failure” of the multivariate implicit 
function theorem and the way this situation leads to square-root singularities. 


(a) As a preliminary observation, we note that each component solution y; is an 
algebraic function that has a non-zero radius of convergence. This can be checked 
directly by the method of majorant series (Note IV.20, p. 250), or as a consequence 
of the multivariate version of the implicit function theorem (Appendix B.5: Implicit 
Function Theorem, p. 753). 


(b) Properness together with the positivity of the system implies that each $y_j(z)$ 
has non-negative coefficients in its expansion at 0, since it is a formal limit of approximants 
that have non-negative coefficients. In particular, by positivity, the radius of convergence $\rho_j$ 
of $y_j$ is a singularity of $y_j$ (by virtue of Pringsheim’s theorem). From the known nature of singularities 
of algebraic functions (e.g., the Newton–Puiseux Theorem, p. 498 below), 
there must exist some order $R > 0$ such that each Rth derivative $\partial_z^R y_j(z)$ becomes 
infinite as $z \to \rho_j$. 

We establish now that $\rho_1 = \cdots = \rho_m$. In effect, differentiation of the equations 
composing the system implies that a derivative of arbitrary order r, $\partial_z^r y_j(z)$, is a linear 
form in other derivatives $\partial_z^r y_i(z)$ of the same order (and a polynomial form in lower 
order derivatives); also the linear combination and the polynomial form have non-negative 
coefficients. Assume a contrario that the radii were not all equal, say $\rho_1 = \cdots = \rho_s$, 
with the other radii $\rho_{s+1}, \ldots$ being strictly greater. Consider the system 
differentiated a sufficiently large number of times, R. Then, as $z \to \rho_1$, we must have 
$\partial_z^R y_j$ tending to infinity for $j \le s$. On the other hand, the quantities $y_{s+1}$, etc., being 
analytic, their Rth derivatives that are analytic as well must tend to finite limits. In 
other words, because of the irreducibility assumption (and again positivity), infinity 
has to propagate and we have reached a contradiction. Thus: all the $y_j$ have the same 
radius of convergence. We let $\rho$ denote this common value. 


(c_1) The key step consists in establishing the existence of a square-root singularity 
at the common singularity ρ. Consider first the scalar case, that is 

\[ y - \phi(z, y) = 0, \tag{72} \]

where φ is assumed to be a nonlinear polynomial in y and have non-negative 
coefficients. This case belongs to the smooth implicit function schema, whose argument we 
briefly revisit under our present perspective. 

Let y(z) be the unique branch of the algebraic function that is analytic at 0. 
Comparison of the asymptotic orders in y inside the equality y = φ(z, y) shows that (by 
nonlinearity) we cannot have y → ∞ when z tends to a finite limit. Let now ρ be the 
radius of convergence of y(z). Since y(z) is necessarily finite at its singularity ρ, we 
set τ = y(ρ) and note that, by continuity, τ − φ(ρ, τ) = 0. 

By the implicit function theorem, a solution (z_0, y_0) of (72) can be continued 
analytically as (z, y_0(z)) in the vicinity of z_0 as long as the derivative with respect to y 
(the simplest form of a Jacobian), 

\[ J(z_0, y_0) = 1 - \phi'_y(z_0, y_0), \]

remains non-zero. The quantity ρ being a singularity, we must thus have J(ρ, τ) = 0. 
On the other hand, the second derivative −φ''_{yy} is non-zero at (ρ, τ) (by nonlinearity 
and positivity). Then, the local expansion of the defining equation (72) at (ρ, τ) binds 
(z, y) locally by 

\[ -(z - \rho)\,\phi'_z(\rho, \tau) - \tfrac{1}{2}(y - \tau)^2\,\phi''_{yy}(\rho, \tau) + \cdots = 0, \]

implying the singular expansion 

\[ y - \tau = -\gamma\,(1 - z/\rho)^{1/2} + \cdots. \]

This establishes the first part of the assertion in the scalar case. 
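
As a numerical illustration of the scalar case, the pair of conditions τ = φ(ρ, τ) and 1 = φ'_y(ρ, τ) can be solved by a two-dimensional Newton iteration. The sketch below is only an illustration under stated assumptions: the polynomial φ(z, y) = z(1 + y + y²) is our own choice (it is not prescribed by the proof), the helper names are hypothetical, and the constant is computed as γ = (2ρ φ'_z / φ''_{yy})^{1/2}, which is what the quadratic local expansion above yields once it is solved for y − τ.

```python
import math

# Hypothetical example, not taken from the text: phi(z, y) = z * (1 + y + y^2).
def phi(z, y):     return z * (1 + y + y * y)
def phi_z(z, y):   return 1 + y + y * y      # d(phi)/dz
def phi_y(z, y):   return z * (1 + 2 * y)    # d(phi)/dy
def phi_zy(z, y):  return 1 + 2 * y          # d^2(phi)/dz dy
def phi_yy(z, y):  return 2 * z              # d^2(phi)/dy^2

def solve_characteristic(z, y, steps=50):
    """Newton iteration for the pair of equations y = phi(z, y), 1 = phi_y(z, y)."""
    for _ in range(steps):
        f1, f2 = y - phi(z, y), 1 - phi_y(z, y)
        # Jacobian of (f1, f2) with respect to (z, y).
        a, b = -phi_z(z, y), 1 - phi_y(z, y)
        c, d = -phi_zy(z, y), -phi_yy(z, y)
        det = a * d - b * c
        z += (-f1 * d + f2 * b) / det    # Cramer's rule for the Newton step
        y += (-f2 * a + f1 * c) / det
    return z, y

# The starting point (0.3, 0.9) is an arbitrary guess near the solution.
rho, tau = solve_characteristic(0.3, 0.9)
gamma = math.sqrt(2 * rho * phi_z(rho, tau) / phi_yy(rho, tau))
print(rho, tau, gamma)
```

For this particular φ the values can be found by hand (ρ = 1/3, τ = 1, γ = √3), so the printed numbers provide a sanity check of the square-root expansion y − τ = −γ(1 − z/ρ)^{1/2} + ···.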


(c_2) In the multivariate case, we graft Lalley's ingenious argument [395] that is 
based on a linearized version of the system to which Perron–Frobenius theory is 
applicable. First, irreducibility implies that any component solution y_j depends positively 
and nonlinearly on itself (by possibly iterating Φ), so that a contradiction in 
asymptotic regimes would result, if we suppose that any y_j tends to infinity. Each y_j(z) 
remains finite at the positive dominant singularity ρ. 

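Now, the multivariate version of the implicit function theorem (Theorem B.6, 
p. 755) grants us locally the analytic continuation of any solution y_1, y_2, ..., y_m at z_0 
provided there is no vanishing of the Jacobian determinant 

\[ J(z_0, y_1, \ldots, y_m) := \det\left( \delta_{i,j} - \frac{\partial}{\partial y_j}\Phi_i(z_0, y_1, \ldots, y_m) \right)_{i,j=1..m}, \]

where δ_{i,j} is the Kronecker symbol (equal to 1 if i = j and to 0 otherwise). 
Thus, we must have 

\[ J(\rho, \tau_1, \ldots, \tau_m) = 0 \quad\text{where}\quad \tau_j := y_j(\rho). \tag{73} \]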


The next argument uses Perron–Frobenius theory (Subsection V.5.2 and Note V.34, 
p. 345) and linear algebra. Consider the Jacobian matrix 

\[ K(z, y_1, \ldots, y_m) := \left( \frac{\partial}{\partial y_j}\Phi_i(z, y_1, \ldots, y_m) \right)_{i,j=1..m}, \]

which represents the “linear part” of Φ. For z, y_1, ..., y_m all non-negative, the matrix 
K has positive entries (by positivity of Φ) so that it is amenable to Perron–Frobenius 
theory. In particular it has a positive eigenvalue λ(z, y_1, ..., y_m) that dominates all 
the others in modulus. The quantity 

\[ \lambda(z) := \lambda(z, y_1(z), \ldots, y_m(z)) \]

is increasing, as it is an increasing function of the matrix entries that themselves 
increase with z for z > 0. 

We propose to prove that λ(ρ) = 1. In effect, λ(ρ) < 1 is excluded since 
otherwise (I − K) would be invertible at z = ρ and this would imply J ≠ 0, 
thereby contradicting the singular character of the y_j(z) at ρ. Assume a contrario 
λ(ρ) > 1 in order to exclude the other case. Then, by the monotonicity and continuity 
of λ(z), there would exist ρ̃ < ρ such that λ(ρ̃) = 1. Let ṽ be a left eigenvector 
of K(ρ̃, y_1(ρ̃), ..., y_m(ρ̃)) corresponding to the eigenvalue λ(ρ̃). Perron–Frobenius 
theory guarantees that such a vector ṽ has all its coefficients positive. Then, 
upon multiplying on the left by ṽ the column vectors corresponding to y and Φ(y) 
(which are equal), one gets an identity; this derived identity, upon expanding near ρ̃, 
gives 

\[ A\,(z - \tilde\rho) = -\sum_{i,j} B_{i,j}\,\bigl(y_i(z) - y_i(\tilde\rho)\bigr)\bigl(y_j(z) - y_j(\tilde\rho)\bigr) + \cdots, \tag{74} \]

where ··· hides lower order terms and the coefficients A, B_{i,j} are non-negative with 
A > 0. There is a contradiction in the orders of growth if each y_j is assumed to be 
analytic at ρ̃, since the left-hand side of (74) is of exact order (z − ρ̃) while the 
right-hand side is at least as small as (z − ρ̃)². Thus, we must have λ(ρ) = 1 and λ(x) < 1 
for x ∈ (0, ρ). 
A calculation similar to (74) but with ρ̃ replaced by ρ shows finally that, if 

\[ y_j(z) - y_j(\rho) \sim \gamma_j\,(\rho - z)^{\alpha}, \]

then consistency of asymptotic expansions implies 2α = 1, that is α = 1/2. We have 
thus proved: All the component solutions y_j(z) have a square-root singularity at ρ. 
(The existence of a complete expansion in powers of (ρ − z)^{1/2} results from a 
refinement of this argument.) The proof of the general case (70) is thus complete. 
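
The role played by the dominant eigenvalue can be observed numerically on a small example. The sketch below is an illustration only, under assumptions of our own: the toy system y_1 = z(1 + y_1 y_2), y_2 = z(1 + y_1 y_2) is not taken from the text (by symmetry it reduces to y = z(1 + y²), so that ρ = 1/2 and τ_1 = τ_2 = 1), and the function names are hypothetical. The solution is obtained by iterating Φ from 0, and the Perron–Frobenius eigenvalue of the Jacobian matrix K by power iteration.

```python
import math

def solve_system(z, iterations=2000):
    """Iterate y <- Phi(z, y) from 0 for the toy system y1 = y2 = z * (1 + y1*y2)."""
    y1 = y2 = 0.0
    for _ in range(iterations):
        y1, y2 = z * (1 + y1 * y2), z * (1 + y1 * y2)
    return y1, y2

def jacobian(z, y1, y2):
    """Matrix K of partial derivatives d(Phi_i)/d(y_j) for the toy system."""
    return [[z * y2, z * y1], [z * y2, z * y1]]

def dominant_eigenvalue(K, iterations=200):
    """Power iteration; adequate here because K has positive entries."""
    v = [1.0] * len(K)
    lam = 0.0
    for _ in range(iterations):
        w = [sum(K[i][j] * v[j] for j in range(len(K))) for i in range(len(K))]
        lam = max(w)
        v = [x / lam for x in w]
    return lam

def lam_of_z(z):
    return dominant_eigenvalue(jacobian(z, *solve_system(z)))

lams = [lam_of_z(z) for z in (0.1, 0.3, 0.45)]
lam_at_rho = dominant_eigenvalue(jacobian(0.5, 1.0, 1.0))
print(lams, lam_at_rho)
```

On this example λ(z) = 1 − (1 − 4z²)^{1/2} in closed form, so the computed values should increase with z, stay below 1 inside (0, ρ), and reach 1 exactly at (ρ, τ_1, τ_2) = (1/2, 1, 1).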


(d) In the aperiodic case, we first observe that each y_j(z) cannot assume an 
infinite value on its circle of convergence |z| = ρ, since this would contradict the 
boundedness of |y_j(z)| in the open disc |z| < ρ (where y_j(ρ) serves as an upper 
bound). Consequently, by singularity analysis, the Taylor coefficients of any y_j(z) are 
O(n^{−1−η}) for some η > 0 and the series representing y_j at the origin converges on 
|z| = ρ. 

For the rest of the argument, we observe that, if y = Φ(z, y), then y = Φ^{⟨m⟩}(z, y) 
where the superscript denotes iteration of the transformation Φ in the variables y = 
(y_1, ..., y_m). By irreducibility, Φ^{⟨m⟩} is such that each of its component polynomials 
involves all the variables. 

Assume a contrario the existence of a singularity ρ* of some y_j(z) on |z| = ρ. 
The triangle inequality yields |y_j(ρ*)| ≤ y_j(ρ), and the stronger form |y_j(ρ*)| < 
y_j(ρ) results from the Daffodil Lemma (p. 267). Then, the modified Jacobian matrix 
K^{⟨m⟩} of Φ^{⟨m⟩} taken at the y_j(ρ*) has entries dominated strictly by the entries of K^{⟨m⟩} 
taken at the y_j(ρ). Therefore, the dominant eigenvalue of K^{⟨m⟩}(z, y_j(ρ*)) must be 
strictly less than 1. This would imply that I − K^{⟨m⟩}(z, y_j(ρ*)) is invertible so that 
the y_j(z) would be analytic at ρ*. A contradiction has been reached: ρ is the sole 
dominant singularity of each y_j and this concludes the argument. ∎ 


Many extensions of the DLW Theorem are possible, as indicated by the notes and 
references below—the underlying arguments are powerful, versatile, and highly gen- 
eral. Consequences regarding limit distributions, as obtained by Drmota and Lalley, 
are further explored in Chapter IX (p. 681). 
▷ VII.29. Analytic systems. Drmota [172] has shown that the conclusions of the DLW 
Theorem regarding universality of the square-root singularity hold more generally for Φ_j that are 
analytic functions of C^{m+1} to C, provided there exists a positive solution of the 
characteristic system within the domain of analyticity of the Φ_j (see the original article [172] and the 
note [99] for a discussion of precise conditions). This extension then unifies the DLW theorem 
and Theorem VII.3 relative to the smooth implicit function schema. 


▷ VII.30. Pólya systems. Woods [616] has shown that several systems built from Pólya 
operators of the form MSET, can also be treated by an extension of the DLW Theorem, which then 
unifies this theorem and Theorem VII.4. 


▷ VII.31. Infinite systems. Lalley [398] has extended the conclusions of the DLW Theorem to 
certain infinite systems of generating function equations. This makes it possible to quantify the 
return probabilities of certain random walks on infinite free products of finite groups. 


The square-root singularity property ceases to be universal when the assumptions 
of Theorems VII.5 and VII.6, in essence, positivity or irreducibility, fail to be 
satisfied. For instance, supertrees that are specified by a positive but reducible system have 
a singularity of the fourth-root type (Example VI.10, p. 412, to be revisited in 
Example VII.20, p. 503). We discuss next, in Section VII.7, general methods that apply to 
any algebraic function and are based on the minimal polynomial equation (rather than 
a system) satisfied by the function. Note that the results there do not always subsume 
the present ones, since structure is not preserved when a system is reduced, by elimi- 
nation, to a single equation. It would at least be desirable to determine directly, from 
a positive (but reducible) system, the type of singular behaviour of the solution, but 
the systematic research involved in such a programme is yet to be carried out. 


VII.7. The general analysis of algebraic functions 


Algebraic series and algebraic functions are simply defined as solutions of a 
polynomial equation or system. Their singularities are strongly constrained to be branch 
points, with the local expansion at a singularity being a fractional power series known 
as a Newton–Puiseux expansion (Subsection VII.7.1). Singularity analysis then turns 
out to be systematically applicable to algebraic functions, to the effect that their 
coefficients are asymptotically composed of elements of the form 

\[ C \cdot \omega^n\, n^{p/q}, \qquad p/q \in \mathbb{Q} \setminus \{-1, -2, \ldots\}, \tag{75} \]

see Subsection VII.7.2. This last form includes as a special case the exponent p/q = 
−3/2, that was encountered repeatedly, when dealing with inverse functions, implicit 
functions, and irreducible systems. In this section, we develop the basic structural 
results that lead to the asymptotic forms (75). However, designing effective methods 
(i.e., decision procedures) to compute the characteristic constants in (75) is not 
obvious in the algebraic case. Several algorithms will be described in order to locate and 
analyse singularities (e.g., Newton's polygon method). In particular, the multivalued 
character of algebraic functions creates a need to solve what are known as connection 
problems. 


Basics. We adopt as the starting point of the present discussion the following 
definition of an algebraic function or series (see also Note VII.32 for a variant). 


Definition VII.6. A function f(z) analytic in a neighbourhood V of a point z_0 is said 
to be algebraic if there exists a (non-zero) polynomial P(z, y) ∈ C[z, y], such that 

\[ P(z, f(z)) = 0, \qquad z \in V. \tag{76} \]

A power series f ∈ C[[z]] is said to be an algebraic power series if it coincides with 
the expansion of an algebraic function at 0. 


The degree of an algebraic series or function f is by definition the minimal value 
of deg_y P(z, y) over all polynomials that are cancelled by f (so that rational series 
are algebraic of degree 1). One can always assume P to be irreducible over C (that is 
P = QR implies that one of Q or R is a scalar) and of minimal degree. 
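
To make the definition concrete, one can compute the power series attached to a polynomial equation and check that it is cancelled by P up to any prescribed order. The sketch below is a minimal illustration under assumptions of our own: it takes the Catalan equation P(z, y) = y − 1 − z y² as the example, represents truncated series as Python lists of integer coefficients, and uses hypothetical helper names.

```python
def mul(a, b, n):
    """Product of two truncated power series (coefficient lists), modulo z^n."""
    c = [0] * n
    for i, ai in enumerate(a[:n]):
        if ai:
            for j, bj in enumerate(b[:n - i]):
                c[i + j] += ai * bj
    return c

def catalan_series(n):
    """Iterate y <- 1 + z*y^2; each pass fixes at least one more coefficient."""
    y = [0] * n
    for _ in range(n):
        y2 = mul(y, y, n)
        y = [1] + y2[:n - 1]          # 1 + z * y^2, truncated modulo z^n
    return y

def residual(y, n):
    """Coefficients of P(z, y) = y - 1 - z*y^2, modulo z^n."""
    zy2 = [0] + mul(y, y, n)[:n - 1]
    return [y[k] - (1 if k == 0 else 0) - zy2[k] for k in range(n)]

series = catalan_series(12)
print(series)
print(residual(series, 12))
```

The first list should start with the Catalan numbers 1, 1, 2, 5, 14, 42, and the second should consist of zeros only, which expresses that P(z, f(z)) = 0 holds modulo z^12.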

An algebraic function may also be defined by starting with a polynomial system 
of the form 

\[ \begin{cases} P_1(z, y_1, \ldots, y_m) = 0 \\ \qquad\vdots \\ P_m(z, y_1, \ldots, y_m) = 0, \end{cases} \tag{77} \]

where each P_j is a polynomial. A solution of the system (77) is by definition an 
m-tuple (f_1, ..., f_m) that cancels each P_j; that is, P_j(z, f_1, ..., f_m) = 0. Any of the 
f_j is called a component solution. A basic but non-trivial result of elimination theory 
is that any component solution of a non-degenerate polynomial system is an algebraic 
series (Appendix B.1: Algebraic elimination, p. 739). In other words, one can 
eliminate the auxiliary variables y_2, ..., y_m and construct a single bivariate polynomial Q 
such that Q(z, y_1) = 0. 

We stress the point that, in the definitions by an equation (76) or a system (77), 
no positivity of any sort nor irreducibility is assumed. The analysis which is now pre- 
sented applies to any algebraic function, whether or not it comes from combinatorics. 
▷ VII.32. Algebraic definition of algebraic series. It is also customary to define f to be an 
algebraic series if it satisfies P(z, f) = 0 in the sense of formal power series, without a priori 
consideration of convergence issues. Then the technique of majorant series may be used to 
prove that the coefficients of f grow at most exponentially. Thus, the alternative definition is 
indeed equivalent to Definition VII.6. ◁ 


▷ VII.33. “Alg is in Diag of Rat”. Every algebraic function F(z) over C(z) is the diagonal of 
a rational function G(x, y) = A(x, y)/B(x, y) ∈ C(x, y). Precisely: 

\[ F(z) = \sum_{n \ge 0} G_{n,n}\, z^n, \qquad\text{where}\qquad G(x, y) = \sum_{m,n \ge 0} G_{m,n}\, x^m y^n. \]

This is implied by a theorem of Denef and Lipshitz [154], which is related to the holonomic 
framework (Appendix B.4: Holonomic functions, p. 748). ◁ 
Figure VII.15. The real section of the lemniscate of Bernoulli defined by P(z, y) = 
(z² + y²)² − (z² − y²) = 0: the origin is a double point where two analytic branches 
meet; there are also two real branch points at z = ±1. 


▷ VII.34. Multinomial sums and algebraic coefficients. Let F(z) be an algebraic function. 
Then F_n = [z^n]F(z) is a (finite) linear combination of “multinomial forms” defined as 

\[ S_n(\mathcal{C}; h; c_1, \ldots, c_r) := \sum_{\mathcal{C}} \binom{n_0 + h}{n_1, \ldots, n_r}\, c_1^{n_1} \cdots c_r^{n_r}, \]

where the summation is over all values of n_0, n_1, ..., n_r satisfying a collection of linear 
inequalities C involving n. [Hint: a consequence of Denef–Lipshitz.] Consequently: coefficients 
of any algebraic function over Q(z) invariably admit “combinatorial (i.e., binomial) 
expressions”. (Eisenstein's lemma, p. 505, can be used to establish algebraicity over Q(z).) An 
alternative proof can be based on Note IV.39, p. 270, and Equation (31), p. 753. 


VII.7.1. Singularities of general algebraic functions. Let P(z, y) be an 
irreducible polynomial of C[z, y], 

\[ P(z, y) = p_0(z)\, y^d + p_1(z)\, y^{d-1} + \cdots + p_d(z). \]


The solutions of the polynomial equation P(z, y) = 0 define a locus of points (z, y) 
in C x C that is known as a complex algebraic curve. Let d be the y-degree of P. 
Then, for each z there are at most d possible values of y. In fact, there exist d values 
of y “almost always”, that is except for a finite number of cases. 


– If z_0 is such that p_0(z_0) = 0, then there is a reduction in the degree in y and 
  hence a reduction in the number of finite y-solutions for the particular value 
  of z = z_0. One can conveniently regard the points that disappear as “points 
  at infinity” (formally, one then operates in the projective plane). 

– If z_0 is such that P(z_0, y) has a multiple root, then some of the values of y 
  will coalesce. 


Define the exceptional set of P as the set (R is the resultant of Appendix B.1: 
Algebraic elimination, p. 739): 

\[ \Xi[P] := \{ z \mid R(z) = 0 \}, \qquad R(z) := \mathbf{R}\bigl(P(z, y),\, \partial_y P(z, y),\, y\bigr). \tag{78} \]

The quantity R(z) is also known as the discriminant of P(z, y), with y as the main 
variable and z a parameter. If z ∉ Ξ[P], then we have a guarantee that there exist 
d distinct solutions to P(z, y) = 0, since p_0(z) ≠ 0 and ∂_y P(z, y) ≠ 0. Then, by 
the Implicit Function Theorem, each of the solutions y_j lifts into a locally analytic 
function y_j(z). A branch of the algebraic curve P(z, y) = 0 is the choice of such a 
y_j(z) together with a simply connected region of the complex plane throughout which 
this particular y_j(z) is analytic. 


Singularities of an algebraic function can thus only occur if z lies in the 
exceptional set Ξ[P]. At a point z_0 such that p_0(z_0) = 0, some of the branches escape to 
infinity, thereby ceasing to be analytic. At a point z_0 where the resultant polynomial 
R(z) vanishes but p_0(z) ≠ 0, two or more branches collide. This can be either 
a multiple point (two or more branches happen to assume the same value, but each 
one exists as an analytic function around z_0) or a branch point (some of the branches 
actually cease to be analytic). An example of an exceptional point that is not a branch 
point is provided by the classical lemniscate of Bernoulli: at the origin, two branches 
meet while each one is analytic there (see Figure VII.15). 

A partial knowledge of the topology of a complex algebraic curve may be 
obtained by first looking at its restriction to the reals. Consider for instance the 
polynomial equation P(z, y) = 0, where 

\[ P(z, y) = y - 1 - z y^2, \]

which defines the OGF of the Catalan numbers. A rendering of the real part of the 
curve is given in Figure VII.16. The complex aspect of the curve, as given by ℑ(y) as 
a function of z, is also displayed there. In accordance with earlier observations, there 
are normally two sheets (branches) above each point. The exceptional set is given by 
the roots of the discriminant, 

\[ R = z(1 - 4z), \]

that is, z = 0, 1/4. For z = 0, one of the branches escapes at infinity, while for z = 1/4, 
the two branches meet and there is a branch point: see Figure VII.16. 
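
The discriminant of the Catalan curve can be checked mechanically. The following sketch is an illustration only, with assumptions of our own: instead of a symbolic resultant, it evaluates the Sylvester determinant of P and ∂_y P (both viewed as polynomials in y) at a few rational values of z, using exact fractions, and compares the result with z(1 − 4z). The function names are hypothetical.

```python
from fractions import Fraction

def det(m):
    """Determinant by Gaussian elimination over exact fractions."""
    m = [row[:] for row in m]
    n, sign, result = len(m), 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result

def sylvester_resultant(p, q):
    """Resultant of two polynomials given as coefficient lists, highest degree first."""
    dp, dq = len(p) - 1, len(q) - 1
    zero = Fraction(0)
    rows = [[zero] * i + list(p) + [zero] * (dq - 1 - i) for i in range(dq)]
    rows += [[zero] * i + list(q) + [zero] * (dp - 1 - i) for i in range(dp)]
    return det(rows)

def catalan_discriminant(z):
    z = Fraction(z)
    p = [-z, Fraction(1), Fraction(-1)]   # P(z, y) = -z*y^2 + y - 1, in powers of y
    q = [-2 * z, Fraction(1)]             # dP/dy   = -2*z*y + 1
    return sylvester_resultant(p, q)

for z in (Fraction(0), Fraction(1, 4), Fraction(1, 3), Fraction(-2)):
    print(z, catalan_discriminant(z), z * (1 - 4 * z))
```

The last two columns should coincide, and they vanish exactly at z = 0 and z = 1/4, the two exceptional points identified above. Other normalizations of the resultant may differ from this one by a sign or a constant factor, which does not affect the exceptional set.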

In summary the exceptional set provides a set of possible candidates for the sin- 
gularities of an algebraic function. 


Lemma VII.4 (Location of algebraic singularities). Let y(z), analytic at the origin, 
satisfy a polynomial equation P(z, y) = 0. Then, y(z) can be analytically continued 
along any simple path emanating from the origin that does not cross any point of the 
exceptional set defined in (78). 


Proof. At any z_0 that is not exceptional and for a y_0 satisfying P(z_0, y_0) = 0, the fact 
that the discriminant is non-zero implies that P(z_0, y) has only a simple root at y_0, and 
we have P'_y(z_0, y_0) ≠ 0. By the Implicit Function Theorem, the algebraic function 
y(z) is analytic in a neighbourhood of z_0. ∎ 


Nature of singularities. We start the discussion with an exceptional point that 
is placed at the origin (by a translation z ↦ z + z_0) and assume that the equation 
P(0, y) = 0 has k equal roots y_1, ..., y_k, where y = 0 is this common value (by a 
translation y ↦ y + y_0 or an inversion y ↦ 1/y, if points at infinity are 
considered). Consider a punctured disc |z| < r that does not include any other exceptional 
point relative to P. In the argument that follows, we let y_1(z), ..., y_k(z) be analytic 
determinations of the root that tend to 0 as z → 0. 

Start at some arbitrary value interior to the real interval (0, r), where the quantity 
y_1(z) is locally an analytic function of z. By the implicit function theorem, y_1(z) can 
be continued analytically along a circuit that starts from z and returns to z while simply 
encircling the origin (and staying within the punctured disc). Then, by permanence of 
analytic relations, y_1(z) will be taken into another root, say, y_1^{(1)}(z). By repeating the 
Figure VII.16. The real section of the Catalan curve (top). The complex Catalan 
curve with a plot of ℑ(y) as a function of z = (ℜ(z), ℑ(z)) (bottom left); a blow-up 
of ℑ(y) near the branch point at z = 1/4 (bottom right). 


process, we see that, after a certain number of times κ with 1 ≤ κ ≤ k, we will have 
obtained a collection of roots y_1(z) = y_1^{(0)}(z), ..., y_1^{(κ)}(z) = y_1(z) that form a set of 
κ distinct values. Such roots are said to form a cycle. In this case, y_1(t^κ) is an analytic 
function of t except possibly at 0 where it is continuous and has value 0. Thus, by 
general principles (regarding removable singularities, see Morera's Theorem, p. 743), 
it is in fact analytic at 0. This in turn implies the existence of a convergent expansion 
near 0: 

\[ y_1(t^{\kappa}) = \sum_{n=1}^{\infty} c_n t^n. \tag{79} \]

(The parameter t is known as the local uniformizing parameter, as it reduces a 
multivalued function to a single-valued one.) This translates back into the world of z: each 
determination of z^{1/κ} yields one of the branches of the multivalued analytic function 
as 

\[ y_1(z) = \sum_{n=1}^{\infty} c_n z^{n/\kappa}. \tag{80} \]
Alternatively, with ω = e^{2iπ/κ} a root of unity, the κ determinations are obtained as 

\[ y_1^{(j)}(z) = \sum_{n=1}^{\infty} c_n\, \omega^{jn}\, z^{n/\kappa}, \]

each being valid in a sector of opening < 2π. (The case κ = 1 corresponds to an 
analytic branch.) 

If κ = k, then the cycle accounts for all the roots which tend to 0. Otherwise, 
we repeat the process with another root and, in this fashion, eventually exhaust all 
roots. Thus, all the k roots that have value 0 at z = 0 are grouped into cycles of size 
κ_1, ..., κ_ℓ. Finally, values of y at infinity are brought to zero by means of the change 
of variables y = 1/u, then leading to negative exponents in the expansion of y. 


Theorem VII.7 (Newton–Puiseux expansions at a singularity). Let f(z) be a branch 
of an algebraic function P(z, f(z)) = 0. In a circular neighbourhood of a 
singularity ζ slit along a ray emanating from ζ, f(z) admits a fractional series expansion 
(Puiseux expansion) that is locally convergent and of the form 

\[ f(z) = \sum_{k \ge k_0} c_k\, (z - \zeta)^{k/\kappa}, \]

for a fixed determination of (z − ζ)^{1/κ}, where k_0 ∈ Z and κ is an integer ≥ 1, called 
the “branching type” (see footnote 13). 


Newton (1643-1727) discovered the algebraic form of Theorem VII.7 and pub- 
lished it in his famous treatise De Methodis Serierum et Fluxionum (completed in 
1671). This method was subsequently developed by Victor Puiseux (1820-1883) so 
that the name of Puiseux series is customarily attached to fractional series expansions. 
The argument given above is taken from the neat presentation offered by Hille in [334, 
Ch. 12, vol. I]. It is known as a “monodromy argument”, meaning that it consists in 
following the course of values of an analytic function along paths in the complex plane 
till it returns to its original value. 
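
The monodromy argument can be watched in action on the Catalan curve P(z, y) = y − 1 − z y², whose two branches form a 2-cycle at z = 1/4. The sketch below is an illustration under assumptions of our own (the circle radius, the number of steps and the function names are arbitrary choices): a root is followed along a small circle around 1/4 by applying a few Newton corrections at each step, and its value is recorded after one and after two full turns.

```python
import cmath
import math

def P(z, y):
    return y - 1 - z * y * y

def P_y(z, y):
    return 1 - 2 * z * y

def continue_branch(y, centre, radius, turns, steps_per_turn=2000):
    """Follow a root of P(z, y) = 0 along a circle around `centre` by Newton correction."""
    for k in range(1, turns * steps_per_turn + 1):
        z = centre + radius * cmath.exp(2j * math.pi * k / steps_per_turn)
        for _ in range(8):
            y = y - P(z, y) / P_y(z, y)
    return y

centre, radius = 0.25, 0.05          # the circle avoids the other exceptional point z = 0
z_start = centre + radius
root = cmath.sqrt(1 - 4 * z_start)
y_start = (1 - root) / (2 * z_start)  # one of the two roots above z_start
y_other = (1 + root) / (2 * z_start)  # the other root above z_start

after_one_turn = continue_branch(y_start, centre, radius, 1)
after_two_turns = continue_branch(y_start, centre, radius, 2)
print(y_start, after_one_turn, after_two_turns)
```

After one turn the continued value should coincide with the other root, and after two turns with the starting root: this is the cycle of length κ = 2 that underlies the expansion in powers of (z − 1/4)^{1/2}.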

Newton polygon. Newton also described a constructive approach to the 
determination of branching types near a point (z_0, y_0), that, by means of the previous 
discussion, can always be taken to be (0, 0). In order to introduce the discussion, let us 
examine the Catalan generating function near z_0 = 1/4. Elementary algebra gives the 
explicit form of the two branches 

\[ y_1(z) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z}\right), \qquad y_2(z) = \frac{1}{2z}\left(1 + \sqrt{1 - 4z}\right), \]

whose forms are consistent with what Theorem VII.7 predicts. If however one starts 
directly with the equation, 

\[ P(z, y) \equiv y - 1 - z y^2 = 0 \]


then, the translation z = 1/4 − Z (the minus sign is a mere notational convenience), 
y = 2 + Y yields 

\[ Q(Z, Y) \equiv -\tfrac{1}{4} Y^2 + 4Z + 4ZY + ZY^2. \tag{81} \]

Footnote 13. From the general discussion, if k_0 < 0, then κ = 1 is possible (case f(ζ) = ∞, with a polar 
singularity); if k_0 ≥ 0, then a singularity only exists if κ ≥ 2 (case of a branch point with |f(ζ)| < ∞). 


Look for solutions of the form Y = cZ^α(1 + o(1)) with c ≠ 0, whose existence is a 
priori granted by Theorem VII.7 (Newton–Puiseux). Each of the monomials in (81) 
gives rise to a term of a well-determined asymptotic order, respectively, Z^{2α}, Z^1, 
Z^{α+1}, Z^{2α+1}. If the equation is to be identically satisfied, then the main asymptotic 
order of Q(Z, Y) should be 0. Since c ≠ 0, this can only happen if two or more of the 
exponents in the sequence (2α, 1, α + 1, 2α + 1) coincide and the coefficients of the 
corresponding monomials in Q(Z, Y) sum to zero, a condition that is an algebraic constraint 
on the constant c. Furthermore, exponents of all the remaining monomials have to be 
larger since by assumption they represent terms of lower asymptotic order. 

Examination of all the possible combinations of exponents leads one to discover 
that the only possible combination arises from the cancellation of the first two terms 
of Q, namely −(1/4)Y² + 4Z, which corresponds to the set of constraints 

\[ 2\alpha = 1, \qquad -\tfrac{1}{4} c^2 + 4 = 0, \]

with the supplementary conditions α + 1 > 1 and 2α + 1 > 1 being satisfied by this 
choice α = 1/2. We have thus discovered that Q(Z, Y) = 0 is consistent 
asymptotically with 

\[ Y \sim 4 Z^{1/2}, \qquad Y \sim -4 Z^{1/2}. \]

The process can be iterated upon subtracting dominant terms. It invariably gives 
rise to complete formal asymptotic expansions that satisfy Q(Z, Y) = 0 (in the 
Catalan example, these are series in ±Z^{1/2}). Furthermore, elementary majorizations 
establish that such formal asymptotic solutions represent indeed convergent series. Thus, 
local expansions of branches have indeed been determined. 

An algorithmic refinement (also due to Newton) is known as the method of 
Newton polygons. Consider a general polynomial 

\[ Q(Z, Y) = \sum_{j \in J} Z^{a_j} Y^{b_j}, \]

and associate to it the finite set of points (a_j, b_j) in N × N, which is called the Newton 
diagram. It is easily verified that the only asymptotic solutions of the form Y ∝ Z^τ 
correspond to values of τ that are inverse slopes (i.e., Δx/Δy) of lines connecting 
two or more points of the Newton diagram (this expresses the cancellation condition 
between two monomials of Q) and such that all other points of the diagram are on this 
line or to the right of it (as the other monomials must be of smaller order). In other 
words: 

    Newton's polygon method. Any possible exponent τ such that Y ∼ cZ^τ is 
    a solution to a polynomial equation corresponds to one of the inverse slopes 
    of the left-most convex envelope of the Newton diagram. For each viable τ, 
    a polynomial equation constrains the possible values of the corresponding 
    coefficient c. Complete expansions are obtained by repeating the process, 
    which means deflating Y from its main term by way of the substitution Y ↦ 
    Y − cZ^τ. 

Figure VII.17. The real algebraic curve defined by the equation P = (y − z²)(y² − 
z)(y² − z³) − z³y³ near (0, 0) (left) and the corresponding Newton diagram (right). 
Figure VII.17 illustrates what goes on in the case of the curve P = 0 where 

\[ \begin{aligned} P(z, y) &= (y - z^2)(y^2 - z)(y^2 - z^3) - z^3 y^3 \\ &= y^5 - y^3 z - y^4 z^2 + y^2 z^3 - 2 z^3 y^3 + z^4 y + z^5 y^2 - z^6, \end{aligned} \]

considered near the origin. As the factored part suggests, the curve is expected to 
resemble (locally) the union of two orthogonal parabolas and of a curve y = ±z^{3/2} 
having a cusp, i.e., the union of 

\[ y = z^2, \qquad y = \pm\sqrt{z}, \qquad y = \pm z^{3/2}, \]

respectively. It is visible on the Newton diagram that the possible exponents y ∝ z^τ 
at the origin are the inverse slopes of the segments composing the envelope, that is, 

\[ \tau = 2, \qquad \tau = \tfrac{1}{2}, \qquad \tau = \tfrac{3}{2}. \]
For computational purposes, once the branching type κ, the value of k_0 that 
dictates where the expansion starts, and the first coefficient have been determined, 
the full expansion can be recovered by deflating the function from its first term and 
repeating the Newton diagram construction. In fact, after a few initial stages of iteration, the method 
of indeterminate coefficients can always be eventually applied [Bruno Salvy, private 
communication, August 2000]. Computer algebra systems usually have this routine 
included as one of the standard packages; see [531]. 
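
The first stage of Newton's polygon method lends itself to a very short program. The sketch below is a brute-force illustration, not the routine found in computer algebra systems, and its function name is hypothetical: for every pair of monomials Z^a Y^b of the Newton diagram, it computes the exponent τ at which the two monomials have the same order, and it keeps τ when no other monomial is of strictly smaller order, which is the cancellation condition stated above.

```python
from fractions import Fraction

def newton_exponents(points):
    """Exponents tau > 0 for which two monomials Z^a Y^b balance and dominate all others.

    `points` is the Newton diagram: a list of pairs (a, b), one per monomial Z^a Y^b.
    """
    found = set()
    for (a1, b1) in points:
        for (a2, b2) in points:
            if b1 <= b2:
                continue
            tau = Fraction(a2 - a1, b1 - b2)   # solves a1 + tau*b1 = a2 + tau*b2
            if tau <= 0:
                continue
            order = a1 + tau * b1
            if all(a + tau * b >= order for (a, b) in points):
                found.add(tau)
    return sorted(found)

# Monomials of Q(Z, Y) = -(1/4) Y^2 + 4 Z + 4 Z Y + Z Y^2 (Catalan case, equation (81)).
catalan = [(0, 2), (1, 0), (1, 1), (1, 2)]
# Monomials of the expanded polynomial P(z, y) of Figure VII.17.
figure_17 = [(0, 5), (1, 3), (2, 4), (3, 2), (3, 3), (4, 1), (5, 2), (6, 0)]

print(newton_exponents(catalan))
print(newton_exponents(figure_17))
```

The first call should return the single exponent 1/2 and the second the three exponents 1/2, 3/2 and 2, in agreement with the discussion of the Catalan curve and of Figure VII.17. Determining the coefficient c attached to each τ, and iterating after deflation, would require the coefficients of the monomials as well; this is left out of the sketch.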


VII.7.2. Asymptotic form of coefficients. The Newton–Puiseux theorem 
describes precisely the local singular structure of an algebraic function. The expansions 
are valid around a singularity and, in particular, they hold in indented discs of the type 
required in order to apply the formal translation mechanisms of singularity analysis. 
Theorem VII.8 (Algebraic asymptotics). Let f(z) = Σ_n f_n z^n be the branch of an 
algebraic function that is analytic at 0. Assume that f(z) has a unique dominant 
singularity at z = α_1 on its circle of convergence. Then, in the non-polar case, the 
coefficient f_n satisfies the asymptotic expansion, 

\[ f_n \sim \alpha_1^{-n} \left( \sum_{k \ge k_0} d_k\, n^{-1-k/\kappa} \right), \tag{82} \]

where k_0 ∈ Z and κ is an integer ≥ 2. In the polar case, κ = 1 and k_0 < 0, the 
estimate (82) is to be interpreted as a terminating (exponential–polynomial) form. 
If f(z) has several dominant singularities |α_1| = |α_2| = ··· = |α_r|, then there 
exists an asymptotic decomposition (where ε is some small fixed number, ε > 0) 

\[ f_n = \sum_{j=1}^{r} \phi^{(j)}(n) + O\left( (|\alpha_1| + \epsilon)^{-n} \right), \tag{83} \]

where each φ^{(j)}(n) admits a complete asymptotic expansion, 

\[ \phi^{(j)}(n) \sim \alpha_j^{-n} \sum_{k \ge k_0^{(j)}} d_k^{(j)}\, n^{-1-k/\kappa_j}, \]

with either k_0^{(j)} in Z and κ_j an integer ≥ 2 or κ_j = 1 and k_0^{(j)} < 0. 


Proof. An early version of this theorem appeared as [220, Th. D, p. 293]. The 
expansions granted by Theorem VII.7 are of the exact type required by singularity analysis 
(Theorem VI.4, p. 393). For multiple singularities, Theorem VI.5 (p. 398) based on 
composite contours is to be used: in that case each φ^{(j)}(n) is the contribution obtained 
by transfer of the corresponding local singular element. ∎ 
In the case of multiple singularities, partial cancellations may occur in some of 
the dominant terms of (83): consider for instance the case of 

\[ \frac{1}{\sqrt{1 - \frac{6}{5} z + z^2}} = 1 + 0.60z + 0.04z^2 - 0.36z^3 - 0.408z^4 - \cdots, \]

where the function has two complex conjugate singularities with an argument not 
commensurate to π, and refer to the corresponding discussion of rational coefficients 
asymptotics (Subsection IV.6.1, p. 263). Fortunately, such delicate arithmetic 
situations tend not to arise in combinatorial situations. 
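
The irregular behaviour of such coefficients is easy to observe. The sketch below is an illustration resting on two assumptions of our own: the displayed function is read as 1/√(1 − (6/5)z + z²), which is the reading consistent with the listed coefficients, and its coefficients are computed through the classical three-term recurrence of Legendre polynomials, since 1/√(1 − 2xz + z²) is their generating function (here x = 3/5). The function name is hypothetical.

```python
from fractions import Fraction

def legendre_coefficients(x, n):
    """First n Taylor coefficients of 1/sqrt(1 - 2*x*z + z^2), computed exactly."""
    c = [Fraction(1), Fraction(x)]
    for k in range(1, n - 1):
        # (k + 1) c[k+1] = (2k + 1) x c[k] - k c[k-1]
        c.append(((2 * k + 1) * x * c[k] - k * c[k - 1]) / (k + 1))
    return c[:n]

coeffs = legendre_coefficients(Fraction(3, 5), 40)
print([float(c) for c in coeffs[:5]])
print("".join("+" if c > 0 else "-" for c in coeffs))
```

The first line reproduces the coefficients 1, 0.6, 0.04, −0.36, −0.408 of the expansion, and the second displays the sign pattern of the first 40 coefficients, which follows no fixed period since the arguments of the two conjugate singularities are not commensurate with π.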


Example VII.18. Branches of unary–binary trees. The generating function of unary–binary 
trees (Motzkin numbers, pp. 68 and 396) is f(z) defined by P(z, f(z)) = 0 where 

\[ P(z, y) = y - z - z y - z y^2, \]

so that 

\[ f(z) = \frac{1 - z - \sqrt{1 - 2z - 3z^2}}{2z} = \frac{1 - z - \sqrt{(1 + z)(1 - 3z)}}{2z}. \]

There exist only two branches: f and its conjugate f̄ that form a 2-cycle at z = 1/3. The 
singularities of all branches are at 0, −1, 1/3 as is apparent from the explicit form of f or from 
the defining equation. The branch representing f(z) at the origin is analytic there (by a general 
argument or by the combinatorial origin of the problem). Thus, the dominant singularity of f(z) 
is at 1/3 and it is unique in its modulus class. The “easy” case of Theorem VII.8 then applies 
once f(z) has been expanded near 1/3. As a rule, the organization of computations is simpler 
if one makes use of the local uniformizing parameter with a choice of sign in accordance to the 
direction along which the singularity is approached. In this case, we set z = 1/3 − δ² and find 

\[ f(z) = 1 - 3\delta + \frac{9}{2}\delta^2 - \frac{63}{8}\delta^3 + \frac{27}{2}\delta^4 - \frac{2997}{128}\delta^5 + \cdots, \qquad \delta = \left(\tfrac{1}{3} - z\right)^{1/2}. \]

This translates immediately into 

\[ f_n \equiv [z^n] f(z) \sim \frac{3^{n+1/2}}{2\sqrt{\pi n^3}} \left( 1 - \frac{15}{16n} + \frac{505}{512n^2} - \frac{8085}{8192n^3} + \cdots \right), \]

which agrees with the direct derivation of Example VI.3, p. 396. □ 

Figure VII.18. The real algebraic curve corresponding to non-crossing forests. 
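
The quality of this expansion can be tested against exact values. The sketch below is an illustration only (function names are hypothetical): it computes the coefficients f_n exactly from the equation f = z(1 + f + f²) by integer convolution, then compares f_n with the asymptotic formula truncated after the term in n^{-3}.

```python
import math

def unary_binary_counts(n_max):
    """Coefficients f_0..f_{n_max} of f = z*(1 + f + f^2), as exact integers."""
    f = [0, 1] + [0] * (n_max - 1)
    for n in range(2, n_max + 1):
        # [z^n] f = f_{n-1} + [z^{n-1}] f^2, and f_0 = 0 trims the convolution.
        conv = sum(f[k] * f[n - 1 - k] for k in range(1, n - 1))
        f[n] = f[n - 1] + conv
    return f

def asymptotic(n):
    """Leading term and three correction terms of the expansion of f_n."""
    lead = 3 ** (n + 0.5) / (2 * math.sqrt(math.pi * n ** 3))
    return lead * (1 - 15 / (16 * n) + 505 / (512 * n ** 2) - 8085 / (8192 * n ** 3))

f = unary_binary_counts(200)
for n in (10, 50, 200):
    print(n, f[n] / asymptotic(n))
```

The printed ratios should approach 1 rapidly as n grows, with a relative error of the order of n^{-4}, as expected from a four-term asymptotic expansion.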


▷ VII.35. Meta-asymptotics. Estimate the growth of the coefficients in the asymptotic 
expansions of Catalan and Motzkin (unary–binary trees) numbers. ◁ 


Example VII.19. Branches of non-crossing forests. Consider the polynomial equation P(z, y) = 
0, where 

\[ P(z, y) = y^3 + (z^2 - z - 3)\, y^2 + (z + 3)\, y - 1, \]

(see Figure VII.18 for the real branches) and the combinatorial GF satisfying P(z, F) = 0 
determined by the initial conditions, 

\[ F(z) = 1 + z + 2z^2 + 7z^3 + 33z^4 + 181z^5 + 1083z^6 + \cdots \]

(EIS A054727). F(z) is the OGF of non-crossing forests defined in Example VII.16, p. 485. 
The exceptional set is mechanically computed: its elements are roots of the discriminant 

\[ R = -z^3 (5z^3 - 8z^2 - 32z + 4). \]

Newton's algorithm shows that two of the branches at 0, say y_0 and y_2, form a cycle of length 2 
with y_0 = 1 − √z + O(z), y_2 = 1 + √z + O(z) while it is the “middle branch” y_1 = 1 + z + O(z²) 
that corresponds to the combinatorial GF F(z). 
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The non-zero exceptional points are the roots of the cubic factor of R; namely 
Q = {—1.93028, 0.12158, 3.40869}. 


Let € = 0.1258 be the root in (0,1). By Pringsheim’s theorem and the fact that the OGF 
of an infinite combinatorial class must have a positive dominant singularity in [0, 1], the only 
possibility for the dominant singularity of yj (z) is é. 

For z near $\xi$, the three branches of the cubic give rise to one branch that is analytic with
value approximately 0.67816 and a cycle of two conjugate branches with value near 1.21429 at
$z = \xi$. The expansion of the two conjugate branches is of the singular type,

\[ \alpha \pm \beta\sqrt{1 - z/\xi}, \]

where

\[ \alpha = \frac{43}{37} + \frac{18}{37}\,\xi - \frac{35}{74}\,\xi^2 \doteq 1.21429, \qquad
   \beta = \frac{1}{37}\sqrt{228 - 981\,\xi - 5290\,\xi^2} \doteq 0.14931. \]

The determination with a minus sign must be adopted for representing the combinatorial GF
when $z \to \xi^-$, since otherwise one would get negative asymptotic estimates for the
non-negative coefficients. Alternatively, one may examine the way the three real branches
along $(0, \xi)$ match with one another at 0 and at $\xi^-$, then conclude accordingly.


Collecting partial results, we finally get by singularity analysis the estimate

\[ F_n = \frac{\beta}{2\sqrt{\pi n^3}}\,\omega^n \left(1 + O\!\left(\frac{1}{n}\right)\right), \qquad
   \omega = \frac{1}{\xi} \doteq 8.22469, \]

with the cubic algebraic number $\xi$ and the sextic $\beta$ as above.
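
The constants of this example are easy to reproduce numerically. The following sketch assumes nothing beyond the cubic factor of the discriminant and the closed forms for $\alpha$ and $\beta$ quoted above; the helper `bisect` and the names `P`, `dP_dy` are ours. It locates $\xi$ by bisection and then checks that $(\xi, \alpha)$ is, to rounding error, a double root in y of P, which is what a square-root branch point requires.

```python
import math

def bisect(f, lo, hi, tol=1e-15):
    """Root of f in [lo, hi] by bisection; assumes f(lo) and f(hi) differ in sign."""
    f_lo_positive = f(lo) > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def P(z, y):
    return y**3 + (z*z - z - 3) * y*y + (z + 3) * y - 1

def dP_dy(z, y):
    return 3 * y*y + 2 * (z*z - z - 3) * y + (z + 3)

# root in (0, 1) of the cubic factor of the discriminant
xi = bisect(lambda z: 5 * z**3 - 8 * z**2 - 32 * z + 4, 0.0, 1.0)
alpha = 43 / 37 + 18 / 37 * xi - 35 / 74 * xi**2
beta = math.sqrt(228 - 981 * xi - 5290 * xi**2) / 37
omega = 1 / xi

print(xi, omega, beta)
print(P(xi, alpha), dP_dy(xi, alpha))   # both should be close to 0
```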


The example above illustrates several important points in the analysis of coeffi- 
cients of algebraic functions when there are no simple explicit radical forms. First, 
a given combinatorial problem determines a unique branch of an algebraic curve at 
the origin. Next, the dominant singularity has to be identified by “connecting” the 
combinatorial branch with the branches at every possible singularity of the curve. Fi- 
nally, computations tend to take place over algebraic numbers and not simply rational 
numbers. 

So far, examples have illustrated the common situation where the function’s ex- 
ponent at its dominant singularity is 1/2. Our last example shows a case where the 
exponent assumes a different value, namely 1/4. 


Example VII.20. Branches of supertrees. Consider the quartic equation

\[ y^4 - 2y^3 + (1 + 2z)\,y^2 - 2yz + 4z^3 = 0 \]

and let K be the branch analytic at 0 determined by the initial conditions:

\[ K(z) = 2z^2 + 2z^3 + 8z^4 + 18z^5 + 64z^6 + 188z^7 + \cdots. \]


The OGF K corresponds to bicoloured supertrees of Example VI.10, p. 412; a partial graph is 
represented in Figure VII.19. 
The discriminant is found to be

\[ R = 16z^4\,(16z^2 + 4z - 1)\,(-1 + 4z)^3, \]

with roots at 1/4 and $(-1 \pm \sqrt{5})/8$. The dominant singularity of the branch of
combinatorial interest turns out to be at z = 1/4, where K(1/4) = 1/2. The translation
z = 1/4 + Z, y = 1/2 + Y then transforms the basic equation into

\[ 4Y^4 + 8ZY^2 + 16Z^3 + 12Z^2 + Z = 0. \]

Figure VII.19. The real algebraic curve associated with the generating function of
supertrees of type K.

According to Newton's polygon method, the main cancellation arises from $4Y^4 + Z = 0$: this
corresponds to a segment of inverse slope 1/4 in the Newton diagram and accordingly to a cycle
formed with four conjugate branches, i.e., a fourth-root singularity. Thus, one has

\[ K(z) \mathop{\sim}_{z \to 1/4} \frac{1}{2} - \frac{1}{\sqrt{2}}\left(\frac{1}{4} - z\right)^{1/4}, \qquad
   [z^n]K(z) \mathop{\sim}_{n \to \infty} \frac{4^n}{8\,\Gamma(\tfrac{3}{4})\,n^{5/4}}, \]

which is consistent with values found earlier (p. 412).
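
The translated equation can be verified by exact substitution. The sketch below is one possible check, not the book's procedure: bivariate polynomials in (Z, Y) are stored as dictionaries mapping exponent pairs (i, j) to rational coefficients, and the helper names `pmul`, `ppow`, `padd` are ours.

```python
from fractions import Fraction

def pmul(p, q):
    """Multiply bivariate polynomials stored as {(i, j): coeff} for Z^i Y^j."""
    r = {}
    for (a, b), c in p.items():
        for (d, e), f in q.items():
            r[(a + d, b + e)] = r.get((a + d, b + e), 0) + c * f
    return r

def ppow(p, k):
    r = {(0, 0): Fraction(1)}
    for _ in range(k):
        r = pmul(r, p)
    return r

def padd(*terms):
    """Sum of (scalar, polynomial) pairs, dropping zero coefficients."""
    r = {}
    for s, p in terms:
        for key, c in p.items():
            r[key] = r.get(key, 0) + s * c
    return {key: c for key, c in r.items() if c != 0}

z = {(0, 0): Fraction(1, 4), (1, 0): Fraction(1)}   # z = 1/4 + Z
y = {(0, 0): Fraction(1, 2), (0, 1): Fraction(1)}   # y = 1/2 + Y

# y^4 - 2 y^3 + (1 + 2z) y^2 - 2 y z + 4 z^3 after the translation
shifted = padd(
    (1, ppow(y, 4)),
    (-2, ppow(y, 3)),
    (1, ppow(y, 2)),
    (2, pmul(z, ppow(y, 2))),
    (-2, pmul(y, z)),
    (4, ppow(z, 3)),
)
scaled = {key: 4 * c for key, c in shifted.items()}   # clear the denominators
print(scaled)
```

Only the monomials $Y^4$, $ZY^2$, $Z$, $Z^2$, $Z^3$ survive, so the lowest-order balance near the origin is between $4Y^4$ and $Z$, which is the source of the exponent 1/4.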


Computable coefficient asymptotics. The previous discussion contains the germ 
of a complete algorithm for deriving an asymptotic expansion of coefficients of any 
algebraic function. We sketch in Note VII.36 the main principles, while leaving some 
of the details to the reader. Observe that the problem is a connection problem: the 
“shapes” of the various sheets around each point (including the exceptional points) are 
known, but it remains to connect them together and see which ones are encountered 
first when starting with a given branch at the origin. 


▷ VII.36. Algebraic Coefficient Asymptotics (ACA). Here is an outline of the algorithm.

Algorithm ACA:

Input: A polynomial $P(z, y)$ with $d = \deg_y P(z, y)$; a series $Y(z)$ such that $P(z, Y) = 0$
and assumed to be specified by sufficiently many initial terms so as to be distinguished from
all other branches.

Output: The asymptotic expansion of $[z^n]Y(z)$ whose existence is granted by Theorem VII.8.

The algorithm consists of three main steps: Preparation (I), Dominant singularities (II), and
Translation (III).

I. Preparation: Define the discriminant $R(z) = \mathbf{R}(P, P'_y, y)$.

(P1) Compute the exceptional set $\Xi = \{z \mid R(z) = 0\}$ and the points of infinity
$\Xi_0 = \{z \mid p_0(z) = 0\}$, where $p_0(z)$ is the leading coefficient of $P(z, y)$
considered as a function of y.

(P2) Determine the Puiseux expansions of all the d branches at each of the points of
$\Xi \cup \{0\}$ (by Newton diagrams and/or indeterminate coefficients). This includes the
expansion of analytic branches as well. Let $\{y_{\alpha,j}(z)\}_{j=1}^{d}$ be the collection
of all such expansions at some $\alpha \in \Xi \cup \{0\}$.

(P3) Identify the branch at 0 that corresponds to $Y(z)$.


II. Dominant singularities: (Controlled approximate matching of branches). Let
$\Xi_1, \Xi_2, \ldots$ be a partition of the elements of $\Xi \cup \{0\}$ sorted according to the
increasing values of their modulus: it is assumed that the numbering is such that if
$\alpha \in \Xi_i$ and $\beta \in \Xi_j$, then $|\alpha| < |\beta|$ is equivalent to $i < j$.
Geometrically, the elements of $\Xi$ have been grouped in concentric circles. First, a
preparation step is needed.

(D1) Determine a non-zero lower bound $\delta$ on the radius of convergence of any local
Puiseux expansion of any branch at any point of $\Xi$. Such a bound can be constructed from
the minimal distance between elements of $\Xi$ and from the degree d of the equation.

The sets $\Xi_j$ are to be examined in sequence until it is detected that one of them contains
a singularity. At step j, let $\sigma_1, \sigma_2, \ldots, \sigma_s$ be an arbitrary listing of
the elements of $\Xi_j$. The problem is to determine whether any $\sigma_k$ is a singularity
and, in that event, to find the right branch to which it is associated. This part of the
algorithm proceeds by controlled numerical approximations of branches and constructive bounds
on the minimum separation distance between distinct branches.

(D2) For each candidate singularity $\sigma_k$, with $k > 2$, set
$\zeta_k = \sigma_k(1 - \delta/2)$. By assumption, each $\zeta_k$ is in the domain of
convergence of $Y(z)$ and of any $y_{\sigma_k,j}$.

(D3) Compute a non-zero lower bound $\eta_k$ on the minimum distance between two roots of
$P(\zeta_k, y) = 0$. This separation bound can be obtained from resultant computations.

(D4) Estimate $Y(\zeta_k)$ and each $y_{\sigma_k,j}(\zeta_k)$ to an accuracy better than
$\eta_k/4$. If two elements, $Y(z)$ and $y_{\sigma_k,j}(z)$, are (numerically) found to be at a
distance less than $\eta_k$ for $z = \zeta_k$, then they are matched: $\sigma_k$ is a
singularity and the corresponding $y_{\sigma_k,j}$ is the corresponding singular element.
Otherwise, $\sigma_k$ is declared to be a regular point for $Y(z)$ and discarded as candidate
singularity.

The main loop on j is repeated until a singularity has been detected, when $j = j_0$, say. The
radius of convergence $\rho$ is then equal to the common modulus of elements of $\Xi_{j_0}$;
the corresponding singular elements are retained.


III. Coefficient expansion: Collect the singular elements at all the points $\sigma$ determined
to be a dominant singularity at Phase II. Translate termwise using the singularity analysis
rule,

\[ (\sigma - z)^{p/\kappa} \;\mapsto\; \sigma^{p/\kappa - n}\,
   \frac{\Gamma(-p/\kappa + n)}{\Gamma(-p/\kappa)\,\Gamma(n + 1)}, \]

and reorganize into descending powers of n, if needed. ◁
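
The translation rule of the last phase is an exact coefficient formula, which makes it easy to test. The sketch below is an illustration under our own naming (`transfer`, `catalan`); it evaluates the Gamma-function expression with the standard library and compares it with a case where the coefficients are known in closed form, namely $(1/4 - z)^{1/2} = \tfrac{1}{2}\sqrt{1 - 4z}$, whose coefficient of $z^n$ is $-C_{n-1}$ for $n \ge 1$, with $C_n$ the Catalan numbers.

```python
import math

def transfer(sigma, p, kappa, n):
    """[z^n] (sigma - z)^(p/kappa) by the Gamma-function rule.
    p/kappa must not be a non-negative integer (Gamma(-p/kappa) has a pole),
    and n must stay below about 170 so that math.gamma does not overflow."""
    a = p / kappa
    return sigma ** (a - n) * math.gamma(n - a) / (math.gamma(-a) * math.gamma(n + 1))

def catalan(n):
    return math.comb(2 * n, n) // (n + 1)

for n in range(1, 8):
    print(n, transfer(0.25, 1, 2, n), -catalan(n - 1))
```

For large n one would replace the Gamma quotient by its expansion in descending powers of n, which is the "reorganization" step mentioned in the algorithm.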


This algorithm vindicates the following assertion (see also Chabaud’s thesis [110]). 


Proposition VII.8 (Decidability of algebraic connections.). The dominant singular- 
ities of a branch of an algebraic function can be determined in a finite number of 
operations by the algorithm ACA of Note VII.36. 


▷ VII.37. Eisenstein's lemma. Let y(z) be an algebraic function with rational coefficients (for
instance a combinatorial generating function) satisfying $\Phi(z, y(z)) = 0$, where the
coefficients of the polynomial $\Phi$ are in $\mathbb{C}$; then there exists a polynomial
$\Psi$ with integer coefficients such that $\Psi(z, y(z)) = 0$. (Hint [65]. Consider the case
where the coefficients of $\Phi$ are $\mathbb{Q}$-linear combinations of 1 and an irrational
$\alpha$, and write $\Phi(z, y) = \Phi_1(z, y) + \alpha\,\Phi_2(z, y)$, where
$\Phi_1, \Phi_2 \in \mathbb{Q}[z, y]$; extracting $[z^n]\Phi(z, y(z))$ would produce a
$\mathbb{Q}$-linear relation between 1 and $\alpha$, unless one of $\Phi_1, \Phi_2$ is trivial,
which must then be the case.) Thus, one can get $\Psi(z, y)$ in $\mathbb{Q}[z, y]$, and by
clearing denominators, in $\mathbb{Z}[z, y]$. As a consequence, for algebraic y(z) with
rational coefficients, there exists an integer B such that for all n, one has
$B^n[z^n]y(z) \in \mathbb{Z}$. Since there are infinitely many primes, the functions $e^z$,
$\log(1 + z)$, $\sum z^n/n^2$, $\sum z^n/(n!)^3$, and so on, are transcendental (i.e., not
algebraic). ◁


▷ VII.38. Powers of binomial coefficients. Define
$S_r(z) := \sum_{n \ge 0} \binom{2n}{n}^r z^n$ with $r \in \mathbb{Z}_{>0}$. For even
$r = 2\nu$ the function $S_{2\nu}(z)$ is transcendental (not algebraic) since its singular
expansion involves a logarithmic term. For odd $r = 2\nu + 1$ and $r \ge 3$, the function
$S_{2\nu+1}(z)$ is also transcendental as a consequence of the arithmetic transcendence of the
number $\pi$; see [220]. These functions intervene in Pólya's drunkard problem (p. 425). In
contrast with the "hard" theory of arithmetic transcendence, it is usually "easy" to establish
transcendence of functions, by exhibiting a local expansion that contradicts the
Newton-Puiseux Theorem (p. 498). ◁


VII.8. Combinatorial applications of algebraic functions 


In this section, we introduce objects whose construction leads to algebraic functions, in a
way that extends the basic symbolic method. This includes: walks with a finite number of
allowed jumps (Subsection VII.8.1) and planar maps (Subsection VII.8.2). In such cases,
bivariate functional equations reflect the combinatorial decompositions of objects. The
common form of these functional equations is

\[ \Phi(z, u, F(z, u), h_1(z), \ldots, h_r(z)) = 0, \tag{84} \]

where $\Phi$ is a known polynomial and the unknown functions are F and $h_1, \ldots, h_r$.
Specific methods are needed in order to attain solutions to such functional equations 
that would seem at first glance to be grossly underdetermined. Walks and excursions 
lead to a linear version of (84) that is treated by the so-called kernel method. Maps lead 
to nonlinear versions that are solved by means of Tutte’s quadratic method. In both 
cases, the strategy consists in binding z and u by forcing them to lie on an algebraic 
curve (suitably chosen in order to eliminate the dependency on F(z, u)), and then 
pulling out consequences of such a specialization. Asymptotic estimates can then be 
developed from such algebraic solutions, thanks to the general methods expounded in 
the previous section. 


VII.8.1. Walks and the kernel method. Start with a set $\Omega$ that is a finite subset of
$\mathbb{Z}$ and is called the set of jumps. A walk (relative to $\Omega$) is a sequence
$w = (w_0, w_1, \ldots, w_n)$ such that $w_0 = 0$ and $w_{i+1} - w_i \in \Omega$, for all i,
$0 \le i < n$. A non-negative walk (also known as a "meander") satisfies $w_j \ge 0$ and an
excursion is a non-negative walk such that, additionally, $w_n = 0$. A bridge is a walk such
that $w_n = 0$. The quantity n is called the length of the walk or the excursion. For
instance, Dyck paths and Motzkin paths analysed in Section V.4, p. 318, are excursions that
correspond to $\Omega = \{-1, +1\}$ and $\Omega = \{-1, 0, +1\}$, respectively. (Walks and
excursions are also somewhat related to paths in graphs in the sense of Section V.5,
p. 336.)
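
These four families can be counted directly from the definitions, which gives reference values for the formulae developed in this subsection. The sketch below is a plain dynamic programme over altitudes; the function names (`count_walks`, `bridges`, `meanders`, `excursions`) are ours, and jump sets are passed as lists so that repeated entries behave as a multiset.

```python
def count_walks(jumps, n, nonneg=False, end_at_zero=False):
    """Number of length-n walks from 0 with steps in `jumps`.
    nonneg=True keeps every altitude >= 0 (meanders);
    end_at_zero=True forces the final altitude to be 0."""
    level = {0: 1}                          # altitude -> number of walks ending there
    for _ in range(n):
        nxt = {}
        for h, ways in level.items():
            for s in jumps:
                if nonneg and h + s < 0:
                    continue                # this step would cross the horizontal axis
                nxt[h + s] = nxt.get(h + s, 0) + ways
        level = nxt
    return level.get(0, 0) if end_at_zero else sum(level.values())

def bridges(jumps, n):
    return count_walks(jumps, n, end_at_zero=True)

def meanders(jumps, n):
    return count_walks(jumps, n, nonneg=True)

def excursions(jumps, n):
    return count_walks(jumps, n, nonneg=True, end_at_zero=True)

print([excursions([-1, +1], n) for n in range(9)])      # Dyck excursions
print([excursions([-1, 0, +1], n) for n in range(7)])   # Motzkin excursions
print([bridges([-1, +1], n) for n in range(7)])         # Dyck bridges
```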

We let $-c$ denote the smallest (negative) value of a jump, and d denote the largest
(positive) jump. A fundamental rôle is played in this discussion by the characteristic
polynomial¹⁴ of the walk,

\[ S(y) := \sum_{\omega \in \Omega} y^{\omega} = \sum_{j=-c}^{d} S_j\,y^j, \]

which is a Laurent polynomial; that is, it involves negative powers of the variable y.


Walks. Observe first the rational character of the BGF of walks, with z marking
length and u marking final altitude:

\[ W(z, u) = \frac{1}{1 - zS(u)}. \tag{85} \]

Since walks may terminate at a negative altitude, this is a Laurent series in u.

Bridges. The GF of bridges is formally $[u^0]W(z, u)$, since bridges correspond to
walks that end at altitude 0. Thus one has

\[ B(z) = \frac{1}{2i\pi} \int_{\gamma} \frac{1}{1 - zS(u)}\,\frac{du}{u}, \tag{86} \]

upon integrating along a circle $\gamma$ that separates the small and large branches, as
discussed below. The integral can then be evaluated by residues: details are found in [27];
the net result is Equation (97), p. 511.


Excursions and meanders. We propose next to determine the number $F_n$ of excursions of
length n and type $\Omega$, via the corresponding OGF

\[ F(z) = \sum_{n=0}^{\infty} F_n z^n. \]

In fact, we shall determine the more general BGF

\[ F(z, u) := \sum_{n,k} F_{n,k}\,u^k z^n, \]

where $F_{n,k}$ is the number of non-negative walks (meanders) of length n and final
altitude k (i.e., the value of $w_n$ in the definition of a walk is constrained to equal k). In
particular, one has F(z) = F(z, 0).

The main result of this subsection can be stated informally as follows (see Propositions
VII.9, p. 510 and VII.10, p. 513 for precise versions):

For each finite set $\Omega \subset \mathbb{Z}$, the generating function of excursions is an
algebraic function that is explicitly computable from $\Omega$. The number of excursions of
length n satisfies asymptotically a universal law of the form $C A^n n^{-3/2}$.

¹⁴If $\Omega$ is a set, then the coefficients of S lie in {0, 1}. The treatment presented here
applies in all generality to cases where the coefficients are arbitrary positive real numbers.
This accounts for probabilistic situations as well as multisets of jump values.

There are many ways to view this result. The problem is usually treated within probability
theory by means of Wiener-Hopf factorizations [515], and Lalley [396] offers an
insightful analytic treatment from this angle. On another level, Labelle and Yeh [392] 
show that an unambiguous context-free specification of excursions can be systemat- 
ically constructed, a fact that is sufficient to ensure the algebraicity of the GF F(z). 
(Their approach is implicitly based on the construction of a pushdown automaton it- 
self equivalent, by general principles, to a context-free grammar.) The Labelle—Yeh 
construction reduces the problem to a large, but somewhat “blind”, combinatorial pre- 
processing. Accordingly, for analysts, it has the disadvantage of not extracting a sim- 
pler analytic (but non-combinatorial) structure inherent in the problem: the shape of 
the end result can indeed be predicted by the Drmota-Lalley-Woods Theorem, but the
nature of the constants involved is not clearly accessible in this way. 

The kernel method. The method described below is often known as the kernel 
method. It takes some of its inspiration from exercises in the 1968 edition of Knuth’s 
book [377] (Ex. 2.2.1.4 and 2.2.1.11), where a new approach was proposed to the 
enumeration of Catalan and Schröder objects. The technique has since been extended
and systematized by several authors; see for instance [26, 27, 86, 202, 203] for relevant 
combinatorial works. Our presentation below follows that of Lalley [396] and of 
Banderier and Flajolet [27]. 

The polynomial $f_n(u) = [z^n]F(z, u)$ is the generating function of non-negative
walks of length n, with u recording final altitude. A simple recurrence relates
$f_{n+1}(u)$ to $f_n(u)$, namely,

\[ f_{n+1}(u) = S(u) \cdot f_n(u) - r_n(u), \tag{87} \]

where $r_n(u)$ is a Laurent polynomial consisting of the sum of all the monomials of
$S(u)f_n(u)$ that involve negative powers¹⁵ of u:

\[ r_n(u) := \sum_{j=-c}^{-1} u^j \bigl([u^j]\,S(u)f_n(u)\bigr) = \{u^{<0}\}\,S(u)f_n(u). \tag{88} \]


The idea behind the formula is to subtract the effect of those steps that would take the
walk below the horizontal axis. For instance, one has

\[ S(u) = \frac{1}{u} + O(1), \quad\text{so that}\quad r_n(u) = \frac{1}{u}\,f_n(0); \]

\[ S(u) = \frac{1}{u^2} + \frac{1}{u} + O(1), \quad\text{so that}\quad
   r_n(u) = \Bigl(\frac{1}{u^2} + \frac{1}{u}\Bigr) f_n(0) + \frac{1}{u}\,f_n'(0). \]

(This technique is similar to that of "adding a slice", p. 199.)
Generally, set

\[ \lambda_j(u) := \frac{1}{j!}\,\{u^{<0}\}\,u^j S(u). \tag{89} \]

¹⁵The convenient notation $\{u^{<0}\}$ denotes the singular part of a Laurent expansion:
$\{u^{<0}\}f(u) := \sum_{j<0} \bigl([u^j]f(u)\bigr)\,u^j$.

Then, from (87) and (88) (multiply by $z^{n+1}$ and sum), the generating function F(z, u)
satisfies the fundamental functional equation

\[ F(z, u) = 1 + zS(u)F(z, u) - z\,\{u^{<0}\}\bigl(S(u)F(z, u)\bigr). \tag{90} \]

Thus, one has, explicitly,

\[ F(z, u) = 1 + zS(u)F(z, u) - z \sum_{j=0}^{c-1} \lambda_j(u)
   \Bigl[\frac{\partial^j}{\partial u^j}F(z, u)\Bigr]_{u=0}, \tag{91} \]

where the Laurent polynomials $\lambda_j(u)$ depend on S(u) in an effective way by (89).

The main equations (90) and (91) involve one unknown bivariate GF, F(z, u),
and c univariate GFs, the partial derivatives of F specialized at u = 0. It is true, but
not at all obvious, that the single functional equation (91) fully determines the c + 1
unknowns. The basic technique is known as "cancelling the kernel" and it relies on
strong analyticity properties; see the book by Fayolle et al. [203] for deep ramifications
in the study of two-dimensional walks. The form of (91) to be employed for this
purpose starts by grouping on one side the terms involving F(z, u),

\[ F(z, u)\,(1 - zS(u)) = 1 - z \sum_{j=0}^{c-1} \lambda_j(u)\,G_j(z), \qquad
   G_j(z) := \Bigl[\frac{\partial^j}{\partial u^j}F(z, u)\Bigr]_{u=0}. \tag{92} \]

If the right-hand side sum were not present, then the solution would reduce to (85). In
the case at hand, from the combinatorial origin of the problem and implied bounds,
the quantity F(z, u) is bivariate analytic at (z, u) = (0, 0) (by elementary exponential
majorizations on the coefficients). The main principle of the kernel method consists
in coupling the values of z and u in such a way that 1 - zS(u) = 0, so that F(z, u)
disappears from the picture. A condition is that both z and u should remain small (so
that F remains analytic). Relations between the partial derivatives are then obtained
from such a specialization, $(z, u) \mapsto (z, u(z))$, which happen to be just in the right
number.

Consequently, we consider the "kernel equation",

\[ 1 - zS(u) = 0, \tag{93} \]

which is rewritten as

\[ u^c = z \cdot \bigl(u^c S(u)\bigr). \]

Under this form, it is clear that the kernel equation (93) defines c + d branches of an
algebraic function. A local analysis shows that, among these c + d branches, there are
c branches that tend to 0 as $z \to 0$, whereas the other d tend to infinity as $z \to 0$.
(The idea is that, in the equation (93), either one of $zu^{-c} \approx 1$ or
$zu^{d} \approx 1$ predominates; equivalently, a Newton polygon can be constructed.) Let
$u_0(z), \ldots, u_{c-1}(z)$ be the c branches that tend to 0, that we call "small" branches.
In addition, we single out $u_0(z)$, the "principal" solution, by the reality condition

\[ u_0(z) \sim \gamma\,z^{1/c}, \qquad \gamma := (S_{-c})^{1/c} \in \mathbb{R}_{>0} \qquad (z \to 0^+). \]

By local uniformization (see (79), p. 497), the conjugate branches are given locally by

\[ u_\ell(z) = u_0\bigl(e^{2i\pi\ell} z\bigr) \qquad (z \to 0^+). \]

Coupling z and u by $u = u_\ell(z)$ produces interesting specializations of Equation (92).
In that case, (z, u) is close to (0, 0) where F is bivariate analytic so that the
substitution is admissible. By substitution, we get

\[ 1 - z \sum_{j=0}^{c-1} \lambda_j(u_\ell(z))\,\frac{\partial^j}{\partial u^j}F(z, 0) = 0,
   \qquad \ell = 0\,..\,c-1. \tag{94} \]

This is now a linear system of c equations in c unknowns (namely, the partial derivatives)
with algebraic coefficients that, in principle, determines F(z, 0).

A convenient approach to the solution of (94) is due to Mireille Bousquet-Mélou.
The argument goes as follows. The quantity

\[ M(u) := u^c - z\,u^c \sum_{j=0}^{c-1} \lambda_j(u)\,\frac{\partial^j}{\partial u^j}F(z, 0) \tag{95} \]

can be regarded as a polynomial in u. It is monic while it vanishes by construction at
the c small branches $u_0, \ldots, u_{c-1}$. Consequently, one has the factorization,

\[ M(u) = \prod_{\ell=0}^{c-1} \bigl(u - u_\ell(z)\bigr). \tag{96} \]

Now, the constant term of M(u) is otherwise known to equal $-zS_{-c}F(z, 0)$, by the
definition (95) of M(u) and by Equation (89) specialized to $\lambda_0(u)$. Thus, the
comparison of constant terms between (95) and (96) provides us with an explicit form of the
OGF of excursions:

\[ F(z, 0) = \frac{(-1)^{c-1}}{z\,S_{-c}} \prod_{\ell=0}^{c-1} u_\ell(z). \]

One can then finally return to the original functional equation and pull out the BGF F(z, u).
In summary:


Proposition VII.9. Let $\Omega$ be a finite set of jumps and let S(u) be the characteristic
polynomial of $\Omega$. Consider the c small branches of the "kernel" equation,

\[ 1 - zS(u) = 0, \]

denoted by $u_0(z), \ldots, u_{c-1}(z)$. The generating function of excursions is given by

\[ F(z) = \frac{(-1)^{c-1}}{S_{-c}\,z} \prod_{\ell=0}^{c-1} u_\ell(z), \qquad
   \text{where}\quad S_{-c} = [u^{-c}]S(u) \]

is the multiplicity (or weight) of the smallest element $-c \in \Omega$. More generally the
bivariate generating function of non-negative walks (meanders) with u marking final
altitude is bivariate algebraic and given by

\[ F(z, u) = \frac{1}{u^c - z\,u^c S(u)} \prod_{\ell=0}^{c-1} \bigl(u - u_\ell(z)\bigr). \]

The OGF of bridges is expressible in terms of the small branches, by

\[ B(z) = z \sum_{j=0}^{c-1} \frac{u_j'(z)}{u_j(z)}
        = z\,\frac{d}{dz} \log\bigl(u_0(z) \cdots u_{c-1}(z)\bigr). \tag{97} \]

(The proof of (97) is based on a residue evaluation of (86), p. 507.)


Example VII.21. Trees and Łukasiewicz codes. A particular class of walks is of special
interest; it corresponds to cases where c = 1; that is, the largest jump in the negative
direction has amplitude 1. Consequently, $\Omega + 1 = \{0, s_1, s_2, \ldots, s_d\}$. In that
situation, combinatorial theory teaches us the existence of fundamental isomorphisms between
walks defined by steps $\Omega$ and trees whose degrees are constrained to lie in
$1 + \Omega$. The correspondence is by way of Łukasiewicz codes¹⁶, also known as "Polish"
prefix codes introduced in Chapter I. From this correspondence, we expect to find tree GFs in
such cases.

With regard to generating functions, there now exists only one small branch, namely the
solution $u_0(z)$ to $u_0(z) = z\phi(u_0(z))$ (where $\phi(u) = uS(u)$) that is analytic at the
origin. One then has $F(z) = F(z, 0) = \frac{1}{z}u_0(z)$, so that the walk GF is determined by

\[ F(z, 0) = \frac{1}{z}\,u_0(z), \qquad u_0(z) = z\,\phi(u_0(z)), \qquad \phi(u) := uS(u). \]

This form is consistent with what is already known regarding the enumeration of simple
families of trees. In addition, one finds

\[ F(z, u) = \frac{1 - u^{-1}u_0(z)}{1 - zS(u)} = \frac{u - u_0(z)}{u - z\,\phi(u)}. \]

Classical cases are rederived in this way:

— the Catalan walk (Dyck path), defined by $\Omega = \{-1, +1\}$ and $\phi(u) = 1 + u^2$, has
\[ u_0(z) = \frac{1}{2z}\Bigl(1 - \sqrt{1 - 4z^2}\Bigr); \]

— the Motzkin walk, defined by $\Omega = \{-1, 0, +1\}$ and $\phi(u) = 1 + u + u^2$, has
\[ u_0(z) = \frac{1}{2z}\Bigl(1 - z - \sqrt{1 - 2z - 3z^2}\Bigr); \]

— the modified Catalan walk, defined by $\Omega = \{-1, 0, 0, +1\}$ (with two steps of type 0)
and $\phi(u) = 1 + 2u + u^2$, has
\[ u_0(z) = \frac{1}{2z}\Bigl(1 - 2z - \sqrt{1 - 4z}\Bigr); \]

— the d-ary tree walk (the excursions encode d-ary trees), defined by
$\Omega = \{-1, d - 1\}$, has $u_0(z)$ that is defined implicitly by
$u_0(z) = z\bigl(1 + u_0(z)^d\bigr)$.

The kernel method thus provides a new perspective for the enumeration of Dyck paths and
related objects.
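
When c = 1 the implicit equation $u_0 = z\phi(u_0)$ is in Lagrangean form, so excursion counts can be read off without solving for $u_0$. The sketch below assumes the Lagrange inversion formula $[z^m]u_0 = \frac{1}{m}[u^{m-1}]\phi(u)^m$ (a standard result, used here as an assumption rather than derived) together with $F(z) = u_0(z)/z$ for unit weight $S_{-1} = 1$; the function name `excursions_c1` is ours.

```python
def excursions_c1(phi, n):
    """F_n = [z^(n+1)] u0(z), where u0 = z*phi(u0) and phi is the coefficient
    list of phi(u) = u*S(u), lowest degree first (constant term S_{-1} = 1).
    Lagrange inversion: [z^m] u0 = (1/m) [u^(m-1)] phi(u)^m, with m = n + 1."""
    m = n + 1
    power = [1] + [0] * n                   # phi^0, truncated after u^n
    for _ in range(m):
        nxt = [0] * (n + 1)
        for i, a in enumerate(power):
            if a:
                for j, b in enumerate(phi):
                    if i + j <= n:
                        nxt[i + j] += a * b
        power = nxt
    return power[n] // m                    # the quotient is exact

print([excursions_c1([1, 0, 1], n) for n in range(9)])   # Catalan walk
print([excursions_c1([1, 1, 1], n) for n in range(7)])   # Motzkin walk
print([excursions_c1([1, 2, 1], n) for n in range(5)])   # modified Catalan walk
```

The third line lists shifted Catalan numbers, in agreement with $u_0(z) = \frac{1}{2z}(1 - 2z - \sqrt{1 - 4z})$.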


¹⁶Such a code (p. 74) is obtained by a preorder traversal of the tree, recording a jump of
r - 1 when a node of outdegree r is encountered. The sequence of jumps gives rise to an
excursion followed by an extra -1 jump.

Example VII.22. Walks with amplitude at most 2. Take $\Omega = \{-2, -1, 1, 2\}$, so that

\[ S(u) = u^{-2} + u^{-1} + u + u^2. \]

Then, $u_0(z)$, $u_1(z)$ are the two branches that vanish as $z \to 0$ of the curve

\[ y^2 = z\,(1 + y + y^3 + y^4). \]

The linear system that determines F(z, 0) and $F'_u(z, 0)$ is

\[ 1 - \Bigl(\frac{z}{u_0^2} + \frac{z}{u_0}\Bigr) F(z, 0) - \frac{z}{u_0}\,F'_u(z, 0) = 0, \]
\[ 1 - \Bigl(\frac{z}{u_1^2} + \frac{z}{u_1}\Bigr) F(z, 0) - \frac{z}{u_1}\,F'_u(z, 0) = 0 \]

(derivatives are taken with respect to the second argument) and one finds

\[ F(z, 0) = -\frac{1}{z}\,u_0(z)u_1(z), \qquad
   F'_u(z, 0) = \frac{1}{z}\bigl(u_0(z) + u_1(z) + u_0(z)u_1(z)\bigr). \]

This gives the number of walks, through a combination of series expansions,

\[ F(z) = 1 + 2z^2 + 2z^3 + 11z^4 + 24z^5 + 93z^6 + 272z^7 + 971z^8 + 3194z^9 + \cdots. \]

A single algebraic equation for F(z) = F(z, 0) is then obtained by elimination (e.g., via
Gröbner bases) from the system:

\[ u_0^2 - z\,(1 + u_0 + u_0^3 + u_0^4) = 0, \]
\[ u_1^2 - z\,(1 + u_1 + u_1^3 + u_1^4) = 0, \]
\[ zF + u_0u_1 = 0. \]

Elimination shows that F(z) is a root of the equation

\[ z^4y^4 - z^2(1 + 2z)\,y^3 + z(2 + 3z)\,y^2 - (1 + 2z)\,y + 1 = 0. \]

For $\Omega = \{-2, -1, 0, 1, 2\}$, we find similarly $F(z) = -\frac{1}{z}u_0(z)u_1(z)$, where
$u_0$, $u_1$ are the small branches of $y^2 = z\,(1 + y + y^2 + y^3 + y^4)$; the expansion
starts as

\[ F(z) = 1 + z + 3z^2 + 9z^3 + 32z^4 + 120z^5 + 473z^6 + 1925z^7 + 8034z^8 + \cdots \]

(EIS A104184; see also [441]), and F(z) is a root of the equation

\[ z^4y^4 - z^2(1 + z)\,y^3 + z(2 + z)\,y^2 - (1 + z)\,y + 1 = 0. \]

In such cases, the GFs are no longer of the simple tree type.
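
The product formula for excursions can be tested numerically for the jump set {-2, -1, 1, 2}. The sketch below is a rough check under two assumptions of ours: z is small and positive, and the two small branches are then real, close to $\pm\sqrt{z}$, and reachable by the fixed-point iteration $y \leftarrow \pm\sqrt{z(1 + y + y^3 + y^4)}$. The function names are ours as well.

```python
import math

def small_branches(z, iterations=200):
    """The two roots of y^2 = z*(1 + y + y^3 + y^4) that vanish with z,
    for small z > 0, by fixed-point iteration from y = +/- sqrt(z)."""
    roots = []
    for sign in (+1, -1):
        y = sign * math.sqrt(z)
        for _ in range(iterations):
            y = sign * math.sqrt(z * (1 + y + y**3 + y**4))
        roots.append(y)
    return roots

def excursion_gf(z):
    """F(z) = -u0(z) u1(z) / z, the product formula with c = 2 and S_{-2} = 1."""
    u0, u1 = small_branches(z)
    return -u0 * u1 / z

def truncated_series(z):
    # coefficients taken from the expansion of F(z) stated in the example
    coeffs = [1, 0, 2, 2, 11, 24, 93, 272, 971, 3194]
    return sum(c * z**n for n, c in enumerate(coeffs))

print(excursion_gf(0.05), truncated_series(0.05))
```

The two printed values should agree to several decimal places; the residual difference comes from truncating the series.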


Asymptotic analysis. The singularities of the branches involved in the statement
of Proposition VII.9 can be worked out in all generality [27, 396]. The roots of the
kernel equation (93) are singular at points z with value u satisfying the simultaneous
set of equations,

\[ 1 - zS(u) = 0, \qquad S'(u) = 0, \]

where the second equation corresponds to a place where the analytic implicit function
theorem "fails" to define u as an analytic function of z. The second equation always
has a positive root $\tau$, corresponding to a positive value of z, which is
$\rho = 1/S(\tau)$. It is then natural to suspect $\rho$ to be the radius of convergence of
F(z) and the singularity to be of the square-root type ($Z^{1/2}$), this for reasons seen in
the proof of Theorem VII.3 (the smooth implicit-function schema). These properties are shown
in complete detail in the articles [27, 395, 396], where it is also established that the GF of
bridges is of singular type $Z^{-1/2}$, as in the case of Dyck bridges.

Proposition VII.10. Define the structural constant $\tau$ by $S'(\tau) = 0$, $\tau > 0$. Then,
assuming aperiodicity, the number of bridges ($B_n$) and the number of excursions ($F_n$)
satisfy

\[ B_n \sim \beta_0\,\frac{S(\tau)^n}{\sqrt{2\pi n}}, \qquad
   F_n \sim \epsilon_0\,\frac{S(\tau)^n}{2\sqrt{\pi n^3}}, \]

where

\[ \beta_0 := \frac{1}{\tau}\sqrt{\frac{S(\tau)}{S''(\tau)}}, \qquad
   \epsilon_0 := \frac{(-1)^{c-1}}{S_{-c}}\sqrt{\frac{2S(\tau)^3}{S''(\tau)}}\;
   \prod_{\ell=1}^{c-1} u_\ell\Bigl(\frac{1}{S(\tau)}\Bigr). \]

There, the $u_\ell$ represent the small branches and $u_0$ is the "principal" branch that is
finite and real positive as $z \to 0^+$.


Proposition VII.10 expresses a universal law of type $n^{-3/2}$ for excursions and
$n^{-1/2}$ for bridges, a fact otherwise at least partly accessible to classical probability
theory (e.g., via a local limit theorem for bridges and via Brownian motion for
excursions). Basic parameters of walks, excursions, bridges, and meanders can then be
analysed in a uniform fashion [27].
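
A quick numerical experiment illustrates these universal laws on the Motzkin jump set {-1, 0, +1}, which is aperiodic and has c = 1. The sketch below makes several assumptions that should be read as ours: the structural constants are entered by hand ($\tau = 1$, $S(\tau) = 3$, $S''(\tau) = 2$), bridges are counted as $[u^0]S(u)^n$ in accordance with (85), and excursions are counted as $\frac{1}{n+1}[u^{-1}]S(u)^{n+1}$, which is valid for c = 1 only (Lagrange inversion).

```python
import math

def laurent_power_coeff(S, n, k):
    """[u^k] S(u)^n for a Laurent polynomial S given as {exponent: coefficient}."""
    power = {0: 1}
    for _ in range(n):
        nxt = {}
        for e, a in power.items():
            for s, b in S.items():
                nxt[e + s] = nxt.get(e + s, 0) + a * b
        power = nxt
    return power.get(k, 0)

def motzkin_ratios(n):
    """Exact count divided by the asymptotic estimate, for bridges and excursions."""
    S = {-1: 1, 0: 1, 1: 1}
    tau, S_tau, S2_tau = 1.0, 3.0, 2.0           # S'(1) = 0, S(1) = 3, S''(1) = 2
    beta0 = math.sqrt(S_tau / S2_tau) / tau
    eps0 = math.sqrt(2 * S_tau**3 / S2_tau)      # c = 1: empty product, S_{-1} = 1
    n_bridges = laurent_power_coeff(S, n, 0)
    n_excursions = laurent_power_coeff(S, n + 1, -1) // (n + 1)
    bridge_estimate = beta0 * S_tau**n / math.sqrt(2 * math.pi * n)
    excursion_estimate = eps0 * S_tau**n / (2 * math.sqrt(math.pi * n**3))
    return n_bridges / bridge_estimate, n_excursions / excursion_estimate

print(motzkin_ratios(50))
print(motzkin_ratios(200))
```

Both ratios should approach 1 as n grows, with relative errors of order 1/n.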


VII.8.2. Maps and the quadratic method. A (planar) map is a connected planar
graph together with an embedding into the plane. In all generality, loops and
multiple edges are allowed. A planar map therefore separates the plane into regions 
called faces. The maps considered here are in addition rooted, meaning that a face, an 
incident edge, and an incident vertex are distinguished. In this section, only rooted 
maps are considered. (Nothing is lost regarding asymptotic properties of random 
structures when a rooting is imposed. The reason is that a map has, with probabil- 
ity exponentially close to 1, a trivial automorphism group; consequently, almost all 
maps of m edges can be rooted in 2m ways—by choosing an edge, and an orienta- 
tion of this edge—and there is an almost uniform 2m-to-1 correspondence between 
unrooted maps and rooted ones.) When representing rooted maps, we shall agree to 
draw the root edge with an arrow pointing away from the root node, and to take the 
root face as that face lying to the left of the directed edge (represented in grey below): 


Tutte launched in the 1960s a large census of planar maps, with the intention of
attacking the four-colour problem by enumerative techniques¹⁷; see [96, 579, 580,
581, 582]. There is in fact a very large collection of maps defined by various degree
or connectivity constraints. In this chapter, we shall limit ourselves to conveying a
flavour of this vast theory, with the goal of showing how algebraic functions arise.
The presentation takes its inspiration from the book of Goulden and Jackson [303,
Sec. 2.9].

¹⁷The four-colour theorem to the effect that every planar graph can be coloured using only
four colours was eventually proved by Appel and Haken in 1976, using structural graph theory
methods supplemented by extensive computer search.


The quadratic method. Let $\mathcal{M}$ be the class of all maps where size is taken to be
the number of edges. Let M(z, u) be the BGF of maps with u marking the number
of edges on the outside face. The basic surgery performed on maps distinguishes two
cases based upon the nature of the root edge. A rooted map will be declared to be
isthmic if the root edge r of map $\mu$ is an "isthmus"; that is, an edge whose deletion
would disconnect the graph. Clearly, one has

\[ \mathcal{M} = o + \mathcal{M}^{(i)} + \mathcal{M}^{(n)}, \tag{98} \]

where $\mathcal{M}^{(i)}$ (resp. $\mathcal{M}^{(n)}$) represent the class of isthmic (resp.
non-isthmic) maps and 'o' is the graph consisting of a single vertex and no edge. There are
accordingly two ways to build maps from smaller ones by adding a new edge.


(i) The class of all isthmic maps is constructed by taking two arbitrary maps and
joining them together by a new root edge, as shown below:


The effect is to increase the number of edges by 1 (the new root edge) and have the
root face degree become 2 (the two sides of the new root edge) plus the sum of the
root face degrees of the component maps. The construction is clearly revertible. In
other words, the BGF of $\mathcal{M}^{(i)}$ is

\[ M^{(i)}(z, u) = zu^2 M(z, u)^2. \tag{99} \]


(ii) The class of non-isthmic maps is obtained by taking an already existing map
and adding an edge that preserves its root node and "cuts across" its root face in some
unambiguous fashion (so that the construction should be revertible). This operation
will therefore result in a new map with an essentially smaller root-face degree. For
instance, there are five ways to cut across a root face of degree 4; namely,

\[ u^4 \;\mapsto\; zu^5 + zu^4 + zu^3 + zu^2 + zu. \]

In general the effect on a map with root face of degree k is described by the
transformation $u^k \mapsto zu(1 - u^{k+1})/(1 - u)$; equivalently, each monomial
$g(u) = u^k$ is transformed into $zu(g(1) - ug(u))/(1 - u)$. Thus, the OGF of
$\mathcal{M}^{(n)}$ involves a discrete difference operator:

\[ M^{(n)}(z, u) = zu\,\frac{M(z, 1) - uM(z, u)}{1 - u}. \tag{100} \]


Collecting the contributions from (99) and (100) in (98) then yields the basic
functional equation,

\[ M(z, u) = 1 + u^2 z\,M(z, u)^2 + uz\,\frac{M(z, 1) - uM(z, u)}{1 - u}. \tag{101} \]
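
Although (101) involves two unknown functions, it determines the coefficients of M(z, u) one power of z at a time, which gives a direct way to tabulate small cases. The sketch below is one such tabulation with names of our choosing: it extracts $[z^{n+1}]$ on both sides, using the fact that the divided difference sends $u^k$ to $u + u^2 + \cdots + u^{k+1}$.

```python
def maps_by_root_face(n_max):
    """m[n][k] = number of rooted maps with n edges and root-face degree k,
    computed from the functional equation (101) by induction on n."""
    m = [[1]]                                   # n = 0: the single-vertex map
    for n in range(n_max):
        new = [0] * (2 * n + 3)                 # root-face degree is at most 2(n + 1)
        # isthmic part: z u^2 M(z,u)^2
        for a in range(n + 1):
            for i, x in enumerate(m[a]):
                for j, y in enumerate(m[n - a]):
                    new[i + j + 2] += x * y
        # non-isthmic part: u^k -> u + u^2 + ... + u^(k+1)
        for k, x in enumerate(m[n]):
            for t in range(1, k + 2):
                new[t] += x
        m.append(new)
    return m

table = maps_by_root_face(6)
print([sum(row) for row in table])              # number of maps with n edges
print(table[2])                                 # distribution by root-face degree, n = 2
```

Setting u = 1 (summing each row) gives the coefficients of M(z, 1).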


The functional equation (101) binds two unknown functions, M(z, u) and M(z, 1).
Similar to the case of walks, it would seem to be underdetermined. Now, a method
due to Tutte and known as the quadratic method provides solutions. Following Tutte
and the account in [303, p. 138], we consider momentarily the more general equation

\[ \bigl(g_1 F(z, u) + g_2\bigr)^2 = g_3, \tag{102} \]

where $g_j = G_j(z, u, h(z))$ and the $G_j$ are explicit functions; here the unknown
functions are F(z, u) and h(z) (cf. M(z, u) and M(z, 1) in (101)). Bind u and z in
such a way that the left side of (102) vanishes; that is, substitute u = u(z) (a yet
unknown function) so that $g_1 F + g_2 = 0$. Since the left-hand side of (102) now has a
double root in u, so must the right-hand side, which implies

\[ g_3 = 0, \qquad \left.\frac{\partial g_3}{\partial u}\right|_{u = u(z)} = 0. \tag{103} \]

The original equation has become a system of two equations in two unknowns that
determines implicitly h(z) and u(z). From this system, elimination provides individual
equations for u(z) and for h(z). (If needed, F(z, u) can then be recovered by solving
a quadratic equation.) It will be recognized that, if the quantities $g_1, g_2, g_3$ are
polynomials, then the process invariably yields solutions that are algebraic functions.

We now carry out this programme in the case of maps and Equation (101). First,
isolate M(z, u) by completing the square, giving

\[ \Bigl(M(z, u) - \frac{1}{2}\,\frac{1 - u + u^2 z}{u^2 z (1 - u)}\Bigr)^2
   = Q(z, u) - \frac{M(z, 1)}{u(1 - u)}, \tag{104} \]

where

\[ Q(z, u) = \frac{z^2u^4 - 2zu^2(u - 1)(2u - 1) + (1 - u)^2}{4u^4z^2(1 - u)^2}. \]

Next, the condition expressing the existence of a double root is

\[ Q(z, u) - \frac{M(z, 1)}{u(1 - u)} = 0, \qquad
   Q'_u(z, u) - \frac{2u - 1}{u^2(1 - u)^2}\,M(z, 1) = 0. \]

It is now easy to eliminate M(z, 1), since the dependency in M is linear, and a
straightforward calculation shows that u = u(z) should satisfy

\[ \bigl(u^2 z + (u - 1)\bigr)\bigl(u^2 z + (u - 1)(2u - 3)\bigr) = 0. \]

The first parameterization would lead to M(z, 1) = 1/z which is not acceptable. Thus,
u(z) is to be taken as the root of the second factor, with M(z, 1) being defined
parametrically by

\[ z = \frac{(1 - u)(2u - 3)}{u^2}, \qquad M(z, 1) = -u\,\frac{3u - 4}{(2u - 3)^2}. \tag{105} \]

Asymptotic analysis. In principle, the problem of enumerating maps is solved
by (105), albeit in a parameterized form. We can then eliminate u (for instance, by
resultants) and get an explicit equation for $M \equiv M(z, 1)$:

\[ 27z^2M^2 - 18zM + M + 16z - 1 = 0. \]

This quadratic equation is explicitly solvable,

\[ M(z, 1) = -\frac{1}{54z^2}\Bigl(1 - 18z - (1 - 12z)^{3/2}\Bigr), \]

and its singular type is $Z^{3/2}$ (with Z = (1 - 12z)). Summarizing, we obtain one of
the very first results in the enumerative theory of maps.


Proposition VII.11. The OGF of maps admits the explicit form 


1 
106 MQ) = MG, 1)=——— (1=-182—-a=122 ae 
(106) (2) = MG, 1) = ~ Fs (1-182 — (1 ~ 122) 
The number of maps with n edges, Mn = [z" |M(z, 1), satisfies 
2n)!3" 2 
(107) ip 12", 
ni(n + 2)! n> 


The sequence of coefficients is EJS A000168: 
(108) M(z, 1) = 1+2z2+9z7+54z74378z4 +2916z> +240572°+208494z’+---. 


We refer to [303, Sec. 2.9] for detailed calculations (that are nowadays routinely per- 
formed with the assistance of a computer algebra system). Currently, there exist many 
applications of the quadratic method to maps satisfying all sorts of combinatorial 
constraints, in particular multiconnectivity; see [533] for a panorama. Interestingly 
enough, the singular exponent of maps is universally 3/2, a fact further reflected by
the $n^{-5/2}$ factor in the asymptotic form of coefficients. Accordingly, randomness
properties of maps are appreciably different from what is observed in trees and many 
commonly encountered context-free objects (e.g., irreducible ones). 
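
The closed form (107) and the quadratic equation satisfied by M(z, 1) can be cross-checked mechanically. The following is a minimal sketch in exact integer arithmetic; the helper names `map_count` and `quadratic_residual` are illustrative and do not come from the text.

```python
from math import factorial

def map_count(n):
    """Number of rooted planar maps with n edges, by formula (107)."""
    return 2 * factorial(2 * n) * 3**n // (factorial(n) * factorial(n + 2))

def quadratic_residual(coeffs):
    """Taylor coefficients of 27 z^2 M^2 - 18 z M + M + 16 z - 1,
    truncated to the same order as the input series M."""
    N = len(coeffs)
    # coefficients of M^2 (Cauchy product)
    square = [sum(coeffs[i] * coeffs[k - i] for i in range(k + 1)) for k in range(N)]
    res = [0] * N
    for k in range(N):
        res[k] += coeffs[k]                  # + M
        if k >= 1:
            res[k] -= 18 * coeffs[k - 1]     # - 18 z M
        if k >= 2:
            res[k] += 27 * square[k - 2]     # + 27 z^2 M^2
    res[0] -= 1                              # - 1
    res[1] += 16                             # + 16 z
    return res

coeffs = [map_count(n) for n in range(8)]
```

Here `coeffs` reproduces the eight coefficients displayed in (108), and every entry of `quadratic_residual(coeffs)` vanishes, as expected of a series solution of the quadratic equation.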


▷ VII.39. Lagrangean parametrization of general maps. The change of parameter w = 1 − 1/u
(that is, u = 1/(1 − w)) reduces (105) to the “Lagrangean form”,

(109)    $$w = \frac{z}{1-3w}, \qquad M(z,1) = \frac{1-4w}{(1-3w)^2},$$

to which the Lagrange Inversion Theorem can be applied, giving back (107). ◁
Figure VII.20. The “kitten”: a random irreducible triangulation with a quadrangular
outer face built out of 69 vertices and 200 edges. Left: a projection of a
three-dimensional view (imagine the map drawn on a surface in $\mathbb{R}^3$). Right: a straight-line
orthogonal rendering based on Fusy’s algorithm [274].


▷ VII.40. Distances in maps. Chassaing and Schaeffer [113] have shown that the distance
between two random vertices of a random planar map with n faces scales as $n^{1/4}$ when $n \to \infty$.
Le Gall [404] has proved that a rescaled planar triangulation converges to a random “continuum
planar map” that has a spherical topology. See Figure VII.20 for some aspects of a random map.
(Physicists study similar random planar structures under the name of 2-dimensional quantum
gravity; see also Note VI.22, p. 414, for related material.) ◁


▷ VII.41. Matrix integrals and maps. Consider an N × N Hermitian matrix H, such that
$\Re(H_{i,j}) = \Re(H_{j,i}) = x_{i,j}$ and $\Im(H_{i,j}) = -\Im(H_{j,i}) = y_{i,j}$,


and define the Gaussian measure of parameter / on the set of Hermitian matrices as (Tr is the 
matrix trace): 


Dp AN see ia 
dun(A; 4) := ( 7 ) grt I] dX;,i I] dxj, j Ai, ;- 
: i=l i<j 


Let M(t, v) be the multivariate generating function of rooted planar maps, where t marks the
number of edges, v represents the vector of indeterminates $(v_1, v_2, \ldots)$, and $v_j$ marks the
number of vertices of degree j. One has


m 


d 1 oS OU# 
M(t,v) =—t— | lim —1 N 
corer | lin gate foe (8 Don, 


dun (H; N/t) 


m=1 


(For this rich theory largely originating with Bessis, Brézin, Itzykson, Parisi and Zuber [60, 94],
see Zvonkin’s gentle introduction [630], Bouttier’s thesis [88], as well as [89] and references
therein.) ◁


▷ VII.42. The number of planar graphs. The asymptotic number of labelled planar graphs
with n vertices was determined by Giménez and Noy [290] to be of the form

$$G_n \sim g \cdot \gamma^n\, n^{-7/2}\, n!, \qquad g \doteq 0.497004399, \quad \gamma \doteq 27.2268777685.$$

This spectacular result, which settled a long-standing open question, is obtained by a
succession of combinatorial and analytic steps based on: (i) the enumeration of 3-connected
maps (these are the same as graphs, due to unique embeddability), which can be performed
by the quadratic method; (ii) the enumeration of 2-connected graphs by Bender, Gao, and
Wormald [41]; (iii) the integro-differential relations that relate the GFs of 2-connected and
1-connected graphs. The authors of [290] also show that a random planar graph is connected
with probability asymptotic to $e^{-\nu} \doteq 0.96325$ and the mean number of connected components
is asymptotic to $1 + \nu \doteq 1.03743$. See also the rich survey [291] for much more. ◁


VII.9. Ordinary differential equations and systems 


In Part A of this book relative to Symbolic Methods, we have encountered differential
relations attached to several combinatorial constructions.


— Pointing: the operation of pointing a specific atom in an object of a combinatorial
class $\mathcal{C}$ produces a pointed class $\mathcal{D} = \Theta\mathcal{C}$. If the generating function
of $\mathcal{C}$ is C(z) (an OGF in the unlabelled case, an EGF in the labelled case),
then one has

(110)    $$\mathcal{D} = \Theta\mathcal{C} \quad\Longrightarrow\quad D(z) = z\,\frac{d}{dz}\,C(z).$$

See Subsections I.6.2 (p. 86) and II.6.1 (p. 136).

— Order constraints: in Subsection II.6.3 (p. 139), we have defined the boxed
product $\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C})$ to be the modified labelled product comprised of
pairs of elements such that the smallest label is constrained to lie in the $\mathcal{B}$
component. The translation over EGFs is

(111)    $$\mathcal{A} = (\mathcal{B}^{\square} \star \mathcal{C}) \quad\Longrightarrow\quad A(z) = \int_0^z \left(\partial_t B(t)\right)\cdot C(t)\,dt.$$


Thus pointing and order constraints systematically lead to integro-differential relations,
which can be transformed into ordinary differential equations (ODEs) and systems.
Another rich source of differential equations in combinatorics is provided by the
holonomic framework (Appendix B.4: Holonomic functions, p. 748). We summarize below
some of the major methods that can be used to analyse the corresponding GFs.
On the side of differential equations, our analytic arguments largely follow the
accessible introductions found in the books by Henrici [329] and Wasow [602]. Linear
ODEs are examined in Subsection VII.9.1, some simple nonlinear ODEs in
Subsection VII.9.2. The main applications discussed here are relative to trees associated to
ordered structures, quadtrees and increasing trees principally.


VII.9.1. Singularity analysis of linear differential equations. Linear differential
equations with analytic coefficients have solutions that, near a reasonably
well-behaved singularity ζ, are of the form

$$Z^{\theta}\,(\log Z)^{k}\,H(Z), \qquad Z := (z - \zeta),$$

with $\theta \in \mathbb{C}$ an algebraic number, $k \in \mathbb{Z}_{\ge 0}$, and H a locally analytic function. The
coefficients of such solutions are composed of elements that are asymptotically of the
form

$$n^{\beta}\,(\log n)^{k}, \qquad \beta = -\theta - 1,$$

in accordance with the general correspondence provided by singularity analysis. For
instance, a naturally occurring combinatorial structure, the quadtree, gives rise to a
number sequence that, surprisingly, turns out to be asymptotically proportional to
$n^{(\sqrt{17}-3)/2}$.


Regular singularities. Our starting point is a linear ordinary differential equation
(linear ODE), which we take to be of the form

(112)    $$c_0(z)\,\partial^{r} Y(z) + c_1(z)\,\partial^{r-1} Y(z) + \cdots + c_r(z)\,Y(z) = 0, \qquad \partial \equiv \frac{d}{dz}.$$

The integer r is the order. We assume the existence of a simply connected domain Ω
in which the coefficients $c_j \equiv c_j(z)$ are analytic. At a point $z_0$ where $c_0(z_0) \neq 0$, a
classical existence theorem (Note VII.43 and [602, p. 3]) guarantees that, in a
neighbourhood of $z_0$, there exist r linearly independent analytic solutions of (112). Thus,
singularities can only occur at points ζ that are roots of the leading coefficient $c_0(z)$.

▷ VII.43. Analytic solutions. Consider the ODE (112) near $z_0 = 0$ and assume $c_0(0) \neq 0$.
Then, a formal solution Y(z) can be determined, given any set of initial conditions $Y^{(j)}(0) = w_j$,
by the method of indeterminate coefficients. The coefficients can be constructed recurrently,
and simple bounds show that they are of at most exponential growth. ◁


To proceed, we rewrite Equation (112) as

(113)    $$\partial^{r} Y(z) + d_1(z)\,\partial^{r-1} Y(z) + \cdots + d_r(z)\,Y(z) = 0,$$

where $d_j = c_j/c_0$. Under our assumptions, the functions $d_j(z)$ are now meromorphic
in Ω. Given a meromorphic function f(z), we define $\omega_\zeta(f)$ to be the order of the pole
of f at ζ, and $\omega_\zeta(f) = 0$ means that f(z) is analytic at ζ.

Definition VII.7. The differential equations (112) and (113) are said to have a
singularity at ζ if at least one of the $\omega_\zeta(d_j)$ is positive. The point ζ is said to be a regular
singularity¹⁸ if

$$\omega_\zeta(d_1) \le 1, \quad \omega_\zeta(d_2) \le 2, \quad \ldots, \quad \omega_\zeta(d_r) \le r,$$

an irregular singularity otherwise.


For instance, the second-order ODE

(114)    $$Y'' + z^{-1}\sin(z)\,Y' - z^{-2}\cos(z)\,Y = 0$$

has a regular singular point at z = 0, since the orders are 0, 2, respectively. It is a
notable fact that, even though we do not know how to solve explicitly the equation in
terms of the usual special functions of analysis, the asymptotic form of its solutions
can be precisely determined.

Let ζ be a regular singular point, and say we attempt to solve (112) by trying a
solution of the form $Z^{\theta} + \cdots$, where $Z := z - \zeta$. For instance, proceeding somewhat
optimistically with (114) at ζ = 0, we may expect the left-hand side of the equation
to be of the form

$$\left[\theta(\theta-1)\,z^{\theta-2} + \cdots\right] + \left[\theta\,z^{\theta-1} + \cdots\right] - \left[z^{\theta-2} + \cdots\right] = 0.$$

In order to obtain cancellation to main asymptotic order ($z^{\theta-2}$), we must then assume
that the coefficient of $z^{\theta-2}$ vanishes; then, θ solves an algebraic equation of degree 2,
namely, $\theta(\theta-1) - 1 = 0$, which suggests the possibility of two solutions of the form
$z^{\theta}$ near 0, with $\theta = (1 \pm \sqrt{5})/2$. This informal discussion motivates the following
definition.

¹⁸ For “irregular” singularities, see Section VIII.7, p. 581.


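Definition VII.8. Given an equation of the form (113) and a regular singular point ζ,
the indicial polynomial I(θ) at ζ is defined to be

$$I(\theta) = \theta^{\underline{r}} + \delta_1\,\theta^{\underline{r-1}} + \cdots + \delta_r, \qquad \theta^{\underline{\ell}} := \theta(\theta-1)\cdots(\theta-\ell+1),$$

where $\delta_j := \lim_{z\to\zeta}\,(z-\zeta)^{j}\,d_j(z)$. The indicial equation (at ζ) is the algebraic
equation I(θ) = 0.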
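
To make Definition VII.8 concrete, here is a small sketch that assembles the indicial polynomial from the constants $\delta_1, \ldots, \delta_r$ and, in the second-order case, solves the indicial equation. The function names are illustrative only. Polynomials are stored as coefficient lists, lowest degree first.

```python
import math

def falling_factorial_poly(l):
    """Coefficients (lowest degree first) of θ(θ-1)...(θ-l+1)."""
    poly = [1]
    for i in range(l):
        # multiply the current polynomial by (θ - i)
        new = [0] * (len(poly) + 1)
        for k, c in enumerate(poly):
            new[k + 1] += c
            new[k] -= i * c
        poly = new
    return poly

def indicial_poly(deltas):
    """I(θ) = θ^(r) + δ_1 θ^(r-1) + ... + δ_r, with falling-factorial powers."""
    r = len(deltas)
    weights = [1] + list(deltas)
    total = [0] * (r + 1)
    for j, w in enumerate(weights):
        for k, c in enumerate(falling_factorial_poly(r - j)):
            total[k] += w * c
    return total

def quadratic_roots(poly):
    """Real roots of c + b θ + a θ^2 (a positive discriminant is assumed)."""
    c, b, a = poly
    disc = math.sqrt(b * b - 4 * a * c)
    return sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])
```

For Equation (114) at ζ = 0 one has $\delta_1 = 0$ and $\delta_2 = -1$, so that `indicial_poly([0, -1])` represents $\theta^2 - \theta - 1$, whose roots are $(1 \pm \sqrt{5})/2$.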
If we let $\mathcal{L}$ denote the differential operator corresponding to the left-hand side
of (113), we have formally, at a regular singular point,

$$\mathcal{L}\left[Z^{\theta}\right] = I(\theta)\,Z^{\theta-r} + O\left(Z^{\theta-r+1}\right), \qquad Z := (z - \zeta),$$

which justifies the role of the indicial polynomial. (The process used to determine
the solutions by restricting attention to dominant asymptotic terms is analogous to
the Newton polygon construction for algebraic equations.) An important structure
theorem describes the possible types of solutions of a meromorphic ODE at a regular
singularity.

Theorem VII.9 (Regular singularities of ODEs). Consider a meromorphic differential
equation (113) and a regular singular point ζ. Assume that the indicial equation
at ζ, I(θ) = 0, is such that no two roots differ by an integer (in particular, all roots
are distinct). Then, in a slit neighbourhood of ζ, there exists a linear basis of all the
solutions that is comprised of functions of the form

(115)    $$(z - \zeta)^{\theta_j}\,H_j(z - \zeta),$$

where $\theta_1, \ldots, \theta_r$ are the roots of the indicial polynomial and each $H_j$ is analytic at 0.


In the case of roots differing by an integer (or multiple roots), the solutions (115) may
include additional logarithmic terms involving non-negative powers of log(z − ζ).


A description of the logarithmic cases is best based on a matrix treatment of 
the first-order linear system that is equivalent to the ODE [329, 602]. Note VII.44 
describes the main lines of a proof of Theorem VII.9; Note VII.45 discusses the rep- 
resentative case of Euler systems, which is explicitly solvable. 
▷ VII.44. Singular solutions. In the first case of Theorem VII.9 (no two roots differing by an
integer), it suffices to work out the modified differential equation satisfied by $Z^{-\theta_j}\,Y(z)$ and
verify that one of its solutions is analytic at ζ: the coefficients of $H_j$ satisfy a recurrence, as in
the non-singular case, from which their growth is verified to be at most exponential. ◁

▷ VII.45. Euler equations and systems. An equation of the form

$$\partial^{r} Y + e_1 Z^{-1}\,\partial^{r-1} Y + \cdots + e_r Z^{-r}\,Y = 0, \qquad e_j \in \mathbb{C}, \quad Z := (z - \zeta),$$

is known as an Euler equation. In the case where all roots of the indicial equation are simple,
a basis of solutions is exactly of the form $Z^{\theta_j}$. When θ is a root of multiplicity m, the set of
solutions includes $Z^{\theta}(\log Z)^{p}$, for p = 0, ..., m − 1. (Euler equations appear for instance
in the median-of-three quicksort algorithm [378, 538]. See [117] for several applications to
random tree models and the analysis of algorithms.) Euler systems are first-order systems of
the form

$$\frac{d}{dz}\,Y(z) = \frac{A}{z - \zeta}\,Y(z),$$

where $A \in \mathbb{C}^{r\times r}$ is a scalar matrix and $Y = (Y_1, \ldots, Y_r)^{T}$ is a vector of functions. A formal
solution is provided by

$$(z - \zeta)^{A} = \exp\left(A \log(z - \zeta)\right),$$

which indicates that the Jordan block decomposition of A plays a rôle in the occurrence of
logarithmic factors of solutions. ◁


Theorem VII.10 (Coefficient asymptotics for meromorphic ODEs). Let f(z) be analytic
at 0 and satisfy a linear differential equation

$$\frac{d^{r}}{dz^{r}}\,f(z) + c_1(z)\,\frac{d^{r-1}}{dz^{r-1}}\,f(z) + \cdots + c_r(z)\,f(z) = 0,$$

where the coefficients $c_j(z)$ are analytic in $|z| < \rho_1$, except for possibly a pole at
some ζ satisfying $|\zeta| < \rho_1$, $\zeta \neq 0$. Assume that ζ is a regular singular point and no
two roots of the indicial equation at ζ differ by an integer. Then, there exist scalar
constants $\lambda_1, \ldots, \lambda_r \in \mathbb{C}$ such that for any $\rho_0$ with $|\zeta| < \rho_0 < \rho_1$, one has

(116)    $$[z^n] f(z) = \sum_{j=1}^{r} \lambda_j\,\Delta_j(n) + O\left(\rho_0^{-n}\right),$$

where the $\Delta_j(n)$ are of the asymptotic form

(117)    $$\Delta_j(n) \sim \frac{n^{-\theta_j - 1}}{\Gamma(-\theta_j)}\,\zeta^{-n}\left[1 + \sum_{i=1}^{\infty} \frac{s_{i,j}}{n^{i}}\right],$$

and the $\theta_j$ are the roots of the indicial equation at ζ.


Proof. The coefficients $\lambda_j$ relate the particular solution f(z) to the basis of
solutions (115). The rest, by singularity analysis, is nothing but a direct transcription to
coefficients of the solutions provided by the structure theorem, Theorem VII.9, with
$\Delta_j(n) = [z^n]\,(z - \zeta)^{\theta_j} H_j(z - \zeta)$. ∎

Taking into account multiple roots (as in Note VII.45) and roots differing by an
integer, we see that solutions to meromorphic linear ODEs, in the regular case at least,
are only composed of linear combinations of asymptotic elements of the form¹⁹

(118)    $$\zeta^{-n}\,n^{\beta}\,(\log n)^{\ell},$$

where ζ is determined as root of a (possibly transcendental) equation, $c_0(\zeta) = 0$, the
number β is an algebraic quantity (over the field of constants $\delta_j$) determined by the
polynomial equation $I(-\beta - 1) = 0$, and ℓ is an integer.

The coefficients $\lambda_j$ serve to “connect” the particular function of interest, f(z), to
the local basis of singular solutions (115). Their determination thus represents a
connection problem (see pp. 470 and 505 for the easier algebraic case). However, contrary
to what happens for algebraic equations, the determination of the $\lambda_j$ can only be
approached in all generality by numerical methods [252]. (Even when the coefficients
$d_j(z) \in \mathbb{Q}(z)$ are rational fractions, no effective procedure is available to decide, from
an $f(z) \in \mathbb{Q}[[z]]$ determined by initial conditions at 0, which of the connection
coefficients $\lambda_j$ may vanish.) In many combinatorial applications the calculations can be
carried out explicitly, in which case the forms (118) serve as a beacon of what to
expect asymptotically. (Once existence of such forms is granted, e.g., by Theorems VII.9
and VII.10, it is often possible to identify coefficients and/or exponents in asymptotic
expansions directly.) Similar considerations apply to functions defined by systems of
linear differential equations (Note VII.48 below).

¹⁹ The forms (118) are appreciably more general than the corresponding ones arising in algebraic
coefficient asymptotics (Theorem VII.8, p. 501), in which no logarithmic term can be present and the
exponents are constrained to be rational numbers only.

▷ VII.46. Multiple singularities. In the case of several singularities $\zeta_1, \ldots, \zeta_s$, a sum of s
terms, each of the form (117) with $\zeta \to \zeta_j$, expresses $[z^n] f(z)$. [The structure theorem applies
at each $\zeta_j$ and singularity analysis is known to adapt to multiple singularities; cf. Section VI.5,
p. 398.] ◁


▷ VII.47. A relaxation. In Theorem VII.10, one may allow the equation to have a singularity
of any kind at 0. [Only properties of the basis of solutions near ζ are used.] ◁


▷ VII.48. Equivalence between equations and systems. A (first-order) linear differential system
is by definition

$$\frac{d}{dz}\,Y(z) = A(z)\,Y(z),$$

where $Y = (Y_1, \ldots, Y_m)^{T}$ is an m-dimensional column vector and A is an m × m coefficient
matrix. A differential equation of order m can always be reduced to a system of dimension m,
and conversely. Only rational operations and derivatives are involved in each of the
conversions: technically, coefficient manipulations take place in a differential field $\mathbb{K}$ that contains
coefficients of recurrences and systems. (For instance, the set of rational functions $\mathbb{C}(z)$ and the
set of meromorphic functions in an open set Ω are differential fields.)

The proofs are simple extensions of the case m = 2. Starting from the equation $y'' + b y' + c y = 0$,
one sets $Y_1 = y$, $Y_2 = y'$ to get the system

$$\{\partial Y_1 = Y_2, \quad \partial Y_2 = -c\,Y_1 - b\,Y_2\}.$$

Conversely, given the system

$$\{\partial Y_1 = a_{11} Y_1 + a_{12} Y_2, \quad \partial Y_2 = a_{21} Y_1 + a_{22} Y_2\},$$

let $\mathcal{E} = \mathrm{VS}[Y_1, Y_2]$ be the vector space over $\mathbb{K}$ spanned by $Y_1, Y_2$, which is of dimension ≤ 2.
Differentiation of the relation $\partial Y_1 = a_{11} Y_1 + a_{12} Y_2$ shows that $\partial^2 Y_1$ can be expressed as a
combination of $Y_1, Y_2$,

$$\partial^2 Y_1 = a'_{11} Y_1 + a'_{12} Y_2 + a_{11}\,(a_{11} Y_1 + a_{12} Y_2) + a_{12}\,(a_{21} Y_1 + a_{22} Y_2),$$

hence $\partial^2 Y_1$ lies in $\mathcal{E}$. Thus, the system $\{Y_1, \partial Y_1, \partial^2 Y_1\}$ is bound, which corresponds to a
differential equation of order 2 being satisfied by $Y_1$. (In the case where the coefficient matrix A has
a simple pole at ζ, singularities of solutions can be studied by matrix methods akin to those of
Note VII.45.) ◁


Combinatorial applications. The quadtree is a structure, discovered by Finkel
and Bentley [212], that can be superimposed on any sequence of points in Euclidean
space $\mathbb{R}^d$. In computer science, it forms the basis of several algorithms for maintaining
and searching dynamically varying geometric objects [532], and it constitutes a natu- 
ral extension of binary search trees. Quadtrees are associated to differential equations, 
whose order equals the dimension of the underlying space. Some of their major char- 
acteristics can be determined via singularity analysis of these equations [233, 242]. 


Example VII.23. The plain quadtree. Start from the unit square $Q = [0, 1]^2$ and let
$p = (P_1, \ldots, P_n)$ be a sequence of n points drawn uniformly and independently from Q, with
$P_j = (x_j, y_j)$. A quaternary tree, called the quadtree and noted QT(p), is built recursively from p as
follows:


— if p is the empty sequence (n = 0), then QT(p) = ∅ is the empty tree;

— otherwise, let $p_{NW}, p_{NE}, p_{SW}, p_{SE}$ be the four subsequences of points of p that lie,
respectively, North-West, North-East, South-West, South-East of $P_1$. For instance
$p_{SW}$ is $p_{SW} = (P_{j_1}, P_{j_2}, \ldots, P_{j_k})$, where $1 < j_1 < j_2 < \cdots < j_k \le n$, and the
$P_{j_\ell} = (x_{j_\ell}, y_{j_\ell})$ are those of the points that satisfy the predicate $x_{j_\ell} < x_1$ and
$y_{j_\ell} < y_1$. Then QT(p) is

$$\mathrm{QT}(p) = \left\langle P_1;\ \mathrm{QT}(p_{NW}),\ \mathrm{QT}(p_{NE}),\ \mathrm{QT}(p_{SW}),\ \mathrm{QT}(p_{SE}) \right\rangle.$$


In other words, the sequence of points induces a hierarchical partition of the space Q; see
Figure VII.21. (For simplicity, the tree is only defined here for points having different x and
y coordinates, an event that has probability 1.)

Figure VII.21. The quadtree splitting process (left, center); a hierarchical partition
associated to n = 50 random points (right).

Quadtrees are used for searching in two related ways: (i) given a point $P_0 = (x_0, y_0)$,
exact search aims at determining whether $P_0$ occurs in p; (ii) given a coordinate $x_0 \in [0, 1]$, a
partial-match query asks for the set of points P = (x, y) occurring in p such that $x = x_0$
(irrespective of the values of y). Both types are accommodated by the quadtree structure: an exact
search corresponds to descending in the tree, following a branch guided by the coordinates of
the point $P_0$ that is sought; partial match is implemented by recursive descents into two subtrees
(either the pair NW, SW or NE, SE) based on the way $x_0$ compares with the x coordinate of
the root point.
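
The construction and the two kinds of searches just described can be sketched as follows. This is only an illustration under simplifying assumptions: the class and function names are hypothetical, points are plain `(x, y)` tuples with pairwise distinct coordinates, and "North" is taken to mean a larger y coordinate.

```python
import random

class Node:
    def __init__(self, point):
        self.point = point
        self.children = {}            # keys among "NW", "NE", "SW", "SE"

def quadrant(root, p):
    """Quadrant of p relative to the point root."""
    (x1, y1), (x, y) = root, p
    return ("N" if y > y1 else "S") + ("W" if x < x1 else "E")

def insert(node, p):
    if node is None:
        return Node(p)
    q = quadrant(node.point, p)
    node.children[q] = insert(node.children.get(q), p)
    return node

def build(points):
    """Inserting the points in order yields QT(p): the first point is the root."""
    root = None
    for p in points:
        root = insert(root, p)
    return root

def exact_search(node, p):
    """Descend along the single branch guided by the coordinates of p."""
    while node is not None:
        if node.point == p:
            return True
        node = node.children.get(quadrant(node.point, p))
    return False

def partial_match(node, x0, found, visited=0):
    """Collect the points with x == x0; return the number of nodes visited."""
    if node is None:
        return visited
    visited += 1
    x1 = node.point[0]
    if x0 == x1:
        found.append(node.point)
        return visited                # distinct x coordinates: no other match
    side = "W" if x0 < x1 else "E"
    for q in ("N" + side, "S" + side):
        visited = partial_match(node.children.get(q), x0, found, visited)
    return visited
```

For instance, after `root = build(points)`, a call `partial_match(root, x0, found)` fills `found` with the matching points and returns the number of nodes inspected, which is the cost measure analysed below.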

In an ideal world (for computers), trees are perfectly balanced, in which case the search
costs satisfy the approximate recurrences,

(119)    $$f_n = 1 + f_{n/4}, \qquad g_n = 1 + 2\,g_{n/4},$$

for exact search and partial match, respectively. The solutions of these recurrences are $\approx \log_4 n$
and $\approx \sqrt{n}$, respectively. To what extent do randomly grown quadtrees differ from the
perfect shape, and what is the growth of the cost functions on average? The answer lies in the
singularities of certain linear differential equations.
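
As a quick check of the orders of growth claimed for (119), the recurrences can be unrolled on powers of 4. The base cases $f_1 = g_1 = 1$ used below are an assumption made for the illustration; the text leaves them unspecified.

```python
def exact_cost(n):
    """f_n = 1 + f_{n/4}, with the assumed base case f_1 = 1."""
    return 1 if n <= 1 else 1 + exact_cost(n // 4)

def partial_cost(n):
    """g_n = 1 + 2 g_{n/4}, with the assumed base case g_1 = 1."""
    return 1 if n <= 1 else 1 + 2 * partial_cost(n // 4)
```

With these conventions, $n = 4^k$ gives $f_n = k + 1 = \log_4 n + 1$ and $g_n = 2^{k+1} - 1 = 2\sqrt{n} - 1$, in agreement with the stated orders.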
Exact search. Our purpose is to set up recurrences in the spirit of Subsection VI.10.3,
p. 427. We need the probability $\pi_{n,k}$ that a quadtree of size n gives rise to a NW root-subtree
of size k and claim that

(120)    $$\pi_{n,k} = \frac{1}{n}\left(H_n - H_k\right), \qquad H_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}.$$

Indeed, the probability that ℓ elements are West of the root and k are North-West is

(121)    $$\varpi_{n,\ell,k} = \binom{n-1}{k,\ \ell-k,\ n-1-\ell} \int_0^1\!\!\int_0^1 (xy)^{k}\,\bigl(x(1-y)\bigr)^{\ell-k}\,(1-x)^{n-1-\ell}\,dx\,dy.$$

(The double integral is the probability that the first k elements fall NW, the next ℓ − k fall SW,
the rest fall either NE or SE; the integrand corresponds to a conditioning upon the coordinates
(x, y) of the root; the multinomial coefficient takes into account the possible shufflings.) The
Eulerian Beta integral (p. 747) simplifies the integrals to $\varpi_{n,\ell,k} = 1/(n(\ell+1))$, from which
the claimed (120) follows by summation over ℓ.
Given (120), the recurrence

(122)    $$P_n = n + 4 \sum_{k=0}^{n-1} \pi_{n,k}\,P_k, \qquad P_0 = 0,$$

with $\pi_{n,k}$ as in (120), determines the sequence of expected value of path length. This recurrence
translates into the integral equation,

(123)    $$P(z) = \frac{z}{(1-z)^2} + 4 \int_0^z \frac{dt}{t(1-t)} \int_0^t P(u)\,\frac{du}{1-u},$$

itself equivalent to the linear differential equation of order 2:

$$z(1-z)^4\,P''(z) + (1-2z)(1-z)^3\,P'(z) - 4(1-z)^2\,P(z) = 1 + 3z.$$

The homogeneous equation has a regular singularity at z = 1. In such a simple case, it is not
difficult to guess the “right” solution, which can then be verified by substitution:

$$P(z) = \frac{1}{3}\,\frac{1+2z}{(1-z)^2}\,\log\frac{1}{1-z} + \frac{1}{6}\,\frac{4z+z^2}{(1-z)^2}, \qquad P_n = \left(n + \frac{1}{3}\right) H_n - \frac{n+1}{6}.$$

The ratio $P_n/n$ represents the mean level of a random node in a randomly grown quadtree, a
quantity which is thus log n + O(1). Accordingly, quadtrees are on average fairly balanced, the
expected level being within a factor $\log 4 \doteq 1.38$ of the corresponding quantity in a perfect tree.
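
The recurrence (122) and the closed form for $P_n$ can be compared in exact rational arithmetic. The sketch below uses illustrative function names; it takes $\pi_{n,k} = (H_n - H_k)/n$ from (120) and the closed form $P_n = (n + \tfrac13) H_n - \tfrac{n+1}{6}$ obtained by extracting coefficients in the expression of P(z).

```python
from fractions import Fraction

def harmonic(n):
    """Harmonic number H_n as an exact fraction (H_0 = 0)."""
    return sum((Fraction(1, j) for j in range(1, n + 1)), Fraction(0))

def path_lengths(N):
    """P_0..P_N from recurrence (122), with pi_{n,k} = (H_n - H_k)/n."""
    H = [harmonic(n) for n in range(N + 1)]
    P = [Fraction(0)]
    for n in range(1, N + 1):
        s = sum(((H[n] - H[k]) / n * P[k] for k in range(n)), Fraction(0))
        P.append(n + 4 * s)
    return P

def closed_form(n):
    """(n + 1/3) H_n - (n + 1)/6."""
    return (n + Fraction(1, 3)) * harmonic(n) - Fraction(n + 1, 6)
```

The first values are $P_1 = 1$, $P_2 = 3$, $P_3 = 49/9$; the value $P_2 = 3$ is also clear directly, since a two-node tree always has its root at level 1 and one child at level 2.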


Partial match. The analysis of partial match reveals a curious consequence of the
imbalance of quadtrees, where the order of growth differs from that which the perfect tree model (119)
predicts. The recurrence satisfied by the expected cost of a partial match query is determined
by methods similar to path length [233].²⁰ One finds, by a computation similar to (121),

(124)    $$Q_n = 1 + \frac{4}{n(n+1)} \sum_{k=0}^{n-1} (n-k)\,Q_k, \qquad Q_0 = 0,$$

corresponding, for the GF $Q(z) = \sum_n Q_n z^n$, to the inhomogeneous differential equation,
$\mathcal{L}[Q(z)] = 2/(1 - z)$, where the differential operator $\mathcal{L}$ is

(125)    $$\mathcal{L}[f] = z(1-z)^2\,\partial^2 f + 2(1-z)^2\,\partial f - 4f.$$

A particular solution of the inhomogeneous equation is −1/(1 − z), so that y(z) := Q(z) +
1/(1 − z) satisfies the homogeneous equation $\mathcal{L}[y] = 0$.

²⁰ It is also possible, although less convenient, to develop equations starting from basic principles of
the symbolic method.

The differential equation $\mathcal{L}[y] = 0$ is singular at z = 0, 1, +∞ and it has a regular
singularity at z = 1. Since one has $y_n = O(n)$, by the origin of the problem, the singularity
at z = 1 is the one that matters. The indicial polynomial can be computed from its definition
or, equivalently, by simply substituting $y = (z - 1)^{\theta}$ in the definition of $\mathcal{L}$ and discarding lower
order terms. One finds, with Z = z − 1:

$$\mathcal{L}\left[Z^{\theta}\right] = \theta(\theta - 1)\,Z^{\theta} - 4 Z^{\theta} + O\left(Z^{\theta+1}\right).$$

The roots of the indicial equation are then

$$\theta_1 = \frac{1}{2}\left(1 - \sqrt{17}\right), \qquad \theta_2 = \frac{1}{2}\left(1 + \sqrt{17}\right).$$

Theorem VII.9 guarantees that y(z) admits, near z = 1, a representation of the form

(126)    $$y(z) = \lambda_1\,(1 - z)^{\theta_1}\,H_1(z - 1) + \lambda_2\,(1 - z)^{\theta_2}\,H_2(z - 1),$$

with $H_1, H_2$ analytic at 0.

In order to complete the analysis, we still have to verify that the coefficient $\lambda_1$, which
multiplies the singular element that dominates as z → 1, is non-zero. Indeed, if we had $\lambda_1 = 0$,
then one would have y(z) → 0 as z → 1, which contradicts the fact that $y_n \ge 1$. In other
words, here the connection problem is solved by means of bounds that are available from the
combinatorial origin of the problem. Singularity analysis then yields the asymptotic form of
$y_n$, hence of $Q_n$. Summarizing, we have:

Proposition VII.12. Path length in a randomly grown quadtree of size n is on average n log n +
O(n). The expected cost of a partial match query satisfies, for some positive κ:

(127)    $$Q_n \sim \kappa \cdot n^{\alpha - 1}, \qquad \alpha = \frac{\sqrt{17} - 1}{2} \doteq 1.56155.$$


The analysis extends to quadtrees of higher dimensions [233]. In general dimension d,
path length is on average $\frac{2}{d}\,n \log n + O(n)$. The cost of a partial match query is of the order of
$n^{\beta}$, where β is an algebraic number of degree d. The cost of a random (fully specified) search
admits a limit Gaussian distribution, as we prove in Example IX.29, p. 687. ■
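
The growth rate (127) can be observed numerically by iterating the recurrence (124). The sketch below is an illustration only: it uses floating-point arithmetic, keeps running sums so that each step costs O(1), and takes the value of κ from the formula $\kappa = \Gamma(2\alpha)/(2\Gamma(\alpha)^3)$ of Note VII.49.

```python
import math

def partial_match_costs(N):
    """Q_0..Q_N from recurrence (124), using running sums of Q_k and k*Q_k."""
    Q = [0.0]
    s0 = 0.0    # sum of Q_k for k < n
    s1 = 0.0    # sum of k * Q_k for k < n
    for n in range(1, N + 1):
        s0 += Q[n - 1]
        s1 += (n - 1) * Q[n - 1]
        # sum_{k<n} (n - k) Q_k = n * s0 - s1
        Q.append(1.0 + 4.0 * (n * s0 - s1) / (n * (n + 1)))
    return Q

ALPHA = (math.sqrt(17) - 1) / 2
KAPPA = math.gamma(2 * ALPHA) / (2 * math.gamma(ALPHA) ** 3)
```

Since $y_n = Q_n + 1$ is the sequence attached to the homogeneous equation, the ratio $(Q_n + 1)/n^{\alpha-1}$ is the natural quantity to compare with κ; it approaches κ (about 1.595) with a relative error that decays roughly like 1/n.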


▷ VII.49. Quadtrees and hypergeometric functions. For the plain quadtree (d = 2), the change
of variables $y = (1 - z)^{-\alpha}\,\eta(z)$ reduces the differential equation $\mathcal{L}[y] = 0$ to hypergeometric
form. The constant κ in (127) is then found to satisfy

$$\kappa = \frac{1}{2}\,\frac{\Gamma(2\alpha)}{\Gamma(\alpha)^3}, \qquad \alpha = \frac{\sqrt{17} - 1}{2}.$$

Hypergeometric solutions (Note B.15, p. 751) are available for d ≥ 2; see [116, 233, 242]. ◁


> VII.50. Closed meanders. A closed meander of size n is a topological configuration de- 
scribing the way a circuit can cross a river 2n times. The sequence starts as 1, 1,2, 8, 42, 262 
(EIS A005315). For instance, here is a meander of size 5: 




There are good reasons to believe that the number $M_n$ of meanders satisfies

$$M_n \sim C\,A^{n}\,n^{-\beta}, \qquad \text{with} \quad \beta = \frac{29 + \sqrt{145}}{12},$$

based on analogies with well-established models of statistical physics [163]. ◁


VII. 9.2. Nonlinear differential equations. Solutions to nonlinear equations do 
not necessarily have singularities that arise from the equation itself (as in the linear 
case). Even the simplest nonlinear equation, 


$$Y'(z) = Y(z)^2, \qquad Y(0) = a,$$


has a solution $Y(z) = 1/(a^{-1} - z)$ whose singularity depends on the initial condition
and is not visible on the equation itself. The problem of determining the location of
singularities is non-obvious in the case of a nonlinear ODE. Furthermore, the problem 
of determining the nature of singularities for nonlinear equations defies classification 
in general (Note VII.51). In this section, we thus limit ourselves to examining a few 
examples where enough structure is present in the combinatorics, so that fairly explicit 
solutions are available, which are then amenable to singularity analysis. 

▷ VII.51. A universal differential equation. Following ideas of Rubel [521, 522], Duffin [178]
proved the following: The differential equation

(D)    $$2\,y''''\,y'^{2} - 5\,y'''\,y''\,y' + 3\,y''^{3} = 0$$

is universal in the sense that any continuous function φ(x) on $\mathbb{R}$ can be approximated with
arbitrary accuracy by a solution of the equation. Thus, real solutions of nonlinear differential
equations cannot be “classified” in general. [Proof: (i) construct a third-order differential
equation (E) satisfied by the class of functions $g_{a,b,c}(x) = a \cos^4(bx + c)$ for $-\pi/2 < bx + c < \pi/2$;
(ii) verify that any function G(x) that is a juxtaposition of g functions over disjoint intervals
and is smooth enough satisfies (E); (iii) prove that such a G(x) can be taken so that $\int G$
approximates a continuous φ(x) to any predetermined accuracy, and determine (D).] ◁


Example VII.24. Varieties of increasing trees. Consider a labelled class defined by either of

(128)    $$\mathcal{Y} = \mathcal{Z}^{\square} \star \mathrm{SEQ}_{\Omega}(\mathcal{Y}), \qquad \mathcal{Y} = \mathcal{Z}^{\square} \star \mathrm{SET}_{\Omega}(\mathcal{Y}),$$

where a set of integers $\Omega \subseteq \mathbb{Z}_{\ge 0}$ has been fixed. This defines trees that are either plane (SEQ)
or non-plane (SET) and increasing, in the sense that labels go in increasing order along any
branch stemming from the root. Such trees have been encountered in Subsection II.6.3 (p. 139)
in relation to alternating permutations, general permutations, and regressive mappings.


Enumeration of trees. By the symbolic translation of the boxed product, the EGF of $\mathcal{Y}$
satisfies a nonlinear integral equation

(129)    $$Y(z) = \int_0^z \phi(Y(w))\,dw,$$

where the structure function φ is

$$\phi(y) = \sum_{\omega \in \Omega} y^{\omega} \quad \text{(case SEQ)}, \qquad \phi(y) = \sum_{\omega \in \Omega} \frac{y^{\omega}}{\omega!} \quad \text{(case SET)}.$$

The integral equation (129) is our starting point; in order to unify both cases, we set
$\phi_{\omega} := [y^{\omega}]\,\phi(y)$. The discussion below is excerpted from the paper of Bergeron, Flajolet, and Salvy [49].


First note that (129) is equivalent to the nonlinear differential equation

(130)    $$Y'(z) = \phi(Y(z)), \qquad Y(0) = 0,$$
which implies that Y′/φ(Y) = 1 and, upon integrating back,

(131)    $$\int_0^{Y(z)} \frac{d\eta}{\phi(\eta)} = z, \qquad \text{i.e.,} \quad K(Y(z)) = z, \qquad K(y) := \int_0^{y} \frac{d\eta}{\phi(\eta)}.$$

Thus, the EGF Y(z) is the compositional inverse of the integral of the multiplicative inverse of
the structure function. We can visualize this chain of transformation as follows:

(132)    $$Y = \mathrm{Inv} \circ \int \circ\ \frac{1}{(\cdot)} \circ \phi.$$

     Differential eq.      EGF                 ρ       sing. type    coefficient
A:   Y' = (1 + Y)^2        z/(1 − z)           1       Z^{-1}        Y_n = n!
B:   Y' = 1 + Y^2          tan z               π/2     Z^{-1}        Y_{2n+1}/(2n+1)! ∼ 2 (2/π)^{2n+2}
C:   Y' = e^Y              log(1/(1 − z))      1       log Z         Y_n = (n − 1)!
D:   Y' = 1/(1 − Y)        1 − √(1 − 2z)       1/2     Z^{1/2}       Y_n = (2n − 3)!!

Figure VII.22. Some classical varieties of increasing trees: (A) plane binary; (B)
strict plane binary; (C) increasing Cayley; (D) increasing plane.


In simpler situations, the integration defining K(y) in (131) can be carried out explicitly,
so that explicit expressions may become available for Y(z). Figure VII.22 displays data relative
to four such classes, the first three of which were already encountered in Chapter II. In each
case, there is listed: the differential equation (from which the definition of the trees and the form
of φ are apparent), the dominant positive singularity, the singularity type, and the corresponding
form of coefficients. The general analytic expressions of (131) contain much more: they allow
for a general discussion of singularity types and permit us to analyse asymptotically classes that
do not admit of an explicit GF.
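
The four rows of Figure VII.22 can be recovered by solving $Y' = \phi(Y)$, $Y(0) = 0$, as a formal power series. The following is a minimal sketch in exact rational arithmetic; the helper names and the dictionary `STRUCTURE` are illustrative, and the iteration is a plain fixed-point scheme in which each pass fixes one more coefficient.

```python
from fractions import Fraction
from math import factorial

def mul(a, b, N):
    """Product of two power series, truncated after z^N."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b[: N + 1 - i]):
                c[i + j] += ai * bj
    return c

def exp_series(y, N):
    """exp(Y) for a series Y with zero constant term (uses E' = Y' E)."""
    e = [Fraction(1)] + [Fraction(0)] * N
    for n in range(1, N + 1):
        e[n] = sum(k * y[k] * e[n - k] for k in range(1, n + 1)) / n
    return e

def inv_one_minus(y, N):
    """1/(1 - Y) for a series Y with zero constant term."""
    g = [Fraction(1)] + [Fraction(0)] * N
    for n in range(1, N + 1):
        g[n] = sum(y[k] * g[n - k] for k in range(1, n + 1))
    return g

def one_plus(y):
    return [y[0] + 1] + y[1:]

STRUCTURE = {
    "A": lambda y, N: mul(one_plus(y), one_plus(y), N),   # phi(Y) = (1 + Y)^2
    "B": lambda y, N: one_plus(mul(y, y, N)),             # phi(Y) = 1 + Y^2
    "C": exp_series,                                      # phi(Y) = e^Y
    "D": inv_one_minus,                                   # phi(Y) = 1/(1 - Y)
}

def tree_counts(phi, N):
    """Y_1..Y_N, where Y' = phi(Y), Y(0) = 0 and Y_n = n! [z^n] Y(z)."""
    y = [Fraction(0)] * (N + 1)
    for _ in range(N):
        rhs = phi(y, N)
        # integrate term by term: [z^(n+1)] Y = [z^n] phi(Y) / (n + 1)
        y = [Fraction(0)] + [rhs[n] / (n + 1) for n in range(N)]
    return [int(y[n] * factorial(n)) for n in range(1, N + 1)]
```

For example, `tree_counts(STRUCTURE["B"], 7)` gives 1, 0, 2, 0, 16, 0, 272, the tangent numbers interleaved with zeros, while case D gives the odd double factorials 1, 1, 3, 15, 105, and so on.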

Assume for simplicity φ to be an aperiodic entire function (possibly a polynomial). Let
ρ be the radius of convergence of Y(z), which is a singular point (by Pringsheim’s Theorem).
Consider the limiting value Y(ρ). One cannot have Y(ρ) < ∞ since then K(z) being analytic
at Y(ρ) would be analytically invertible (by the Implicit Function Theorem). Thus, one must
have Y(ρ) = +∞ and, since Y and K are inverses of each other, we get K(+∞) = ρ. The
radius of convergence of Y(z) is accordingly

(133)    $$\rho = \int_0^{\infty} \frac{d\eta}{\phi(\eta)}.$$

The singularity type of Y(z) is then systematically determined by the rules (132). For a general
polynomial of degree d ≥ 2, we have (ignoring coefficients)

$$K(+\infty) - K(y) \sim \int_y^{\infty} \frac{d\eta}{\eta^{d}} \asymp \frac{1}{y^{d-1}}, \qquad Y(z) \asymp Z^{-1/(d-1)}, \qquad \text{with } Z := (\rho - z).$$

This back-of-the-envelope calculation shows that

(134)    for φ a polynomial of degree d:  $\dfrac{Y_n}{n!} \sim C\,\rho^{-n}\,n^{-\beta}$, with $\beta = \dfrac{d-2}{d-1}$.


In the same vein, the logarithmic singularity of the EGF of increasing Cayley trees (Case C
of Figure VII.22) appears as eventually reflecting the inverse of the exponential singularity of
$\phi(y) = e^{y}$. Such a singularity type must then be systematically present when considering
increasing non-plane trees (increasing Cayley trees) with a finite collection of node degrees
excluded; in other words, whenever the SET constructor is used in (128) and Ω is a cofinite
set. This observation “explains” and extends an analysis of [437].
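
Formula (133) lends itself to a direct numerical check on the classes of Figure VII.22. The sketch below is a rough illustration, not a robust quadrature routine: it assumes that the caller supplies the reciprocal 1/φ as a function (the parameter name `inv_phi` is hypothetical), maps the half-line to (0, 1) through η = t/(1 − t), and applies the midpoint rule.

```python
import math

def radius(inv_phi, steps=20000):
    """Midpoint-rule estimate of rho = integral of d(eta)/phi(eta) over (0, inf),
    after the change of variable eta = t / (1 - t), d(eta) = dt / (1 - t)^2."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        eta = t / (1.0 - t)
        total += inv_phi(eta) / (1.0 - t) ** 2
    return total * h

rho_A = radius(lambda y: 1.0 / (1.0 + y) ** 2)   # phi(y) = (1 + y)^2
rho_B = radius(lambda y: 1.0 / (1.0 + y * y))    # phi(y) = 1 + y^2
rho_C = radius(lambda y: math.exp(-y))           # phi(y) = e^y
```

The three estimates come out close to 1, π/2 and 1, matching the column ρ of the figure for cases A, B and C. (Case D has a structure function that is not entire, so it falls outside the assumptions made in this paragraph.)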


Additive parameters. Consider next an additive parameter of trees²¹ defined by a
recurrence,

(135)    $$s(\tau) = t_{|\tau|} + \sum_{\upsilon \prec \tau} s(\upsilon),$$

where $(t_n)$ is a numeric sequence of “tolls” with $t_0 = 0$, and the summation $\upsilon \prec \tau$ is carried
out over all root subtrees υ of τ. Introduce the two functions (of cumulated values)

$$S(z) = \sum_{\tau \in \mathcal{Y}} s(\tau)\,\frac{z^{|\tau|}}{|\tau|!}, \qquad T(z) = \sum_{n \ge 0} t_n\,Y_n\,\frac{z^{n}}{n!},$$

so that the ratio $[z^n]S(z)/[z^n]Y(z)$ equals the mean value of parameter s taken over all increasing trees of
size n. By simple algebra similar to Lemma VII.1 (p. 457), it is found that the GF S(z) is

(136)    $$S(z) = Y'(z) \int_0^z \frac{T'(w)}{Y'(w)}\,dw.$$

The relation (136) defines an integral transform $T \mapsto S$, which can be viewed as a singularity
transformer. Thanks to the methods of Subsection VI.10.3, p. 427, its systematic study can be
done, once the singularity type of Y(z) is known.

The discussion of path length ($t_n = n$, corresponding to $T(z) = zY'(z)$) is conducted
in the present perspective as follows. For polynomial varieties of increasing trees, we have
$Y(z) \asymp Z^{-\delta}$ with $\delta = 1/(d-1)$, so that

$$ T \asymp Y' \asymp Z^{-\delta-1}, \qquad T' \asymp Z^{-\delta-2}, \qquad \frac{T'}{Y'} \asymp Z^{-1}, \qquad \int \frac{T'}{Y'} \asymp -\log Z. $$

Thus, the relation between $Y$ and $S$ is of the simplified form $S \sim Y' \log Z$. Singularity analysis
then implies that average path length is of order $n \log n$. Working out the constants involved
gives the following proposition.


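Proposition VII.13. Let $\mathcal{Y}$ be an increasing variety of trees defined by a function $\phi$ that is an
aperiodic polynomial of degree $d \ge 2$ and let $\delta = 1/(d-1)$. The number of trees of size $n$
satisfies

$$ \frac{Y_n}{n!} \sim \frac{1}{\Gamma(\delta)} \left(\frac{\delta}{\phi_d}\right)^{\delta} \rho^{-n-\delta}\, n^{-1+\delta}, \qquad \rho := \int_0^{\infty} \frac{d\eta}{\phi(\eta)}, \qquad \phi_d := [w^d]\phi(w). $$

The expected value of path length on a tree of $\mathcal{Y}_n$ is $(\delta + 1)\, n \log n + O(n)$.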
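The counting estimate can be checked numerically on one case where everything is explicit. The sketch below is illustrative only: it assumes the main term reads $(\delta/\phi_d)^{\delta}\rho^{-n-\delta}n^{\delta-1}/\Gamma(\delta)$, picks ternary increasing trees, $\phi(w) = (1+w)^3$ (so that $Y(z) = (1-2z)^{-1/2} - 1$ and $\rho = 1/2$), and uses hypothetical helper names (`mul_trunc`, `increasing_tree_egf`, `predicted`).

```python
from fractions import Fraction
from math import comb, gamma

def mul_trunc(a, b, n):
    """Product of two power series given as coefficient lists, kept up to z^n."""
    c = [Fraction(0)] * (n + 1)
    for i in range(min(len(a), n + 1)):
        if a[i] == 0:
            continue
        for j in range(min(len(b), n + 1 - i)):
            c[i + j] += a[i] * b[j]
    return c

def increasing_tree_egf(phi, N):
    """Coefficients [z^n]Y(z), n = 0..N, for Y' = phi(Y), Y(0) = 0.

    phi is the coefficient list [phi_0, ..., phi_d] of the degree function.
    """
    y = [Fraction(0)] * (N + 1)
    for n in range(N):
        acc = [Fraction(0)] * (n + 1)      # Horner evaluation of phi(Y) mod z^(n+1)
        for coeff in reversed(phi):
            acc = mul_trunc(acc, y, n)
            acc[0] += coeff
        y[n + 1] = acc[n] / (n + 1)        # (n+1) y_{n+1} = [z^n] phi(Y)
    return y

def predicted(n, d, phi_d, rho):
    """Assumed main asymptotic term of Y_n / n! for a degree-d polynomial phi."""
    delta = 1 / (d - 1)
    return (delta / phi_d) ** delta * rho ** (-n - delta) * n ** (delta - 1) / gamma(delta)

# Ternary increasing trees: phi(w) = (1 + w)^3, rho = int_0^oo (1 + w)^(-3) dw = 1/2.
y = increasing_tree_egf([1, 3, 3, 1], 40)
ratio = float(y[40]) / predicted(40, d=3, phi_d=1, rho=0.5)
print(ratio)   # close to 1; the relative error is of order 1/n
```

Exact rational arithmetic keeps the comparison free of rounding issues; only the final ratio is converted to floating point.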


For naturally occurring models like those of Figure VII.22 and more, many parameters 
of increasing tree varieties can be analysed in a synthetic way (e.g., the degree profile, the 
level profile [49]). What stands out is the type of conceptual reasoning afforded by singularity 
analysis, which provides a direct path to the right order of magnitude of both combinatorial 
counts and basic parameters of structures. After this, it is only a matter of doing the bookkeeping 
and getting the constants right!


21 Such parameters have been investigated in Subsection VI.10.3 (p. 427): the binary search tree
recurrence there corresponds exactly to the case $\phi(w) = (1 + w)^2$ here.
Example VII.25. Pólya urn processes. An interesting example of the joint use of nonlinear
ODEs and singularity analysis is provided by urn processes of probability theory. There, an urn 
may contain balls of different colours. A fixed set of replacement rules is given (one for each 
colour). At any discrete instant, a ball is chosen uniformly at random, its colour is inspected, 
and the corresponding replacement rule is applied. The problem is to determine the evolution 
of the urn at a large instant n. (The book by Johnson and Kotz [357] can serve as an elementary 
introduction to the field; Janson otherwise develops a comprehensive probabilistic approach 
in [349, 351].) In the case of two colours and urns called balanced, it is shown in [130, 225] 
that the generating function of urn histories is determined by a nonlinear first-order autonomous 
system, from which many characteristics of the urn can be effectively analysed. 

In accordance with the informal description above, an urn model with two colours is
determined by a 2 × 2 matrix with integer entries:

$$ M = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}, \qquad \alpha, \delta \in \mathbb{Z}, \quad \beta, \gamma \in \mathbb{Z}_{\ge 0}. \tag{137} $$

At any instant, if a ball of the first colour is drawn, then it is placed back into the urn together
with $\alpha$ balls of the first colour and $\beta$ balls of the second colour; similarly, when a ball of the
second colour is drawn, with $\gamma$ balls of the first colour and $\delta$ balls of the second colour. Negative
diagonal entries mean that balls are taken out of the urn (rather than added to it). We restrict
attention to balanced urns, which are such that there exists $\sigma$, called the balance, with

$$ \sigma = \alpha + \beta = \gamma + \delta. \tag{138} $$

Given an urn initialized with $a_0$ balls of the first colour and $b_0$ balls of the second colour, what
is sought is the multivariate generating function $H(x, y, z)$ (of exponential type), such that
$n!\,[z^n x^a y^b]H(x, y, z)$ is the number of possible evolutions of the urn leading at time $n$ to an
urn with colour composition $(a, b)$. For $\sigma \ge 1$, the total number of evolutions is clearly

$$ (a_0 + b_0)(a_0 + b_0 + \sigma) \cdots (a_0 + b_0 + (n-1)\sigma), \qquad \text{so that} \qquad H(1, 1, z) = \frac{1}{(1 - \sigma z)^{(a_0 + b_0)/\sigma}}. $$


We have the following proposition.
Proposition VII.14. The exponential MGF of a balanced urn with matrix (137), balance $\sigma$,
and initial composition $(a_0, b_0)$ satisfies for $|x_0|, |y_0| < 1$, $x_0 y_0 \ne 0$, and $|z| < 1/\sigma$

$$ H(x_0, y_0, z) = X(z \mid x_0, y_0)^{a_0}\; Y(z \mid x_0, y_0)^{b_0}, $$

where $X(t) = X(t \mid x_0, y_0)$ and $Y(t) = Y(t \mid x_0, y_0)$ are the solutions of the associated
differential system:

$$ \Sigma: \quad \begin{cases} \dfrac{d}{dt} X(t) = X(t)^{\alpha+1}\, Y(t)^{\beta} \\[1ex] \dfrac{d}{dt} Y(t) = X(t)^{\gamma}\, Y(t)^{\delta+1} \end{cases}, \qquad X(0) = x_0, \quad Y(0) = y_0. \tag{139} $$


Proof. The proof is an interesting illustration of the modelling of combinatorial structures by
differential operators (Note I.63, p. 88). As a starting point, we observe that the obvious rule
$\partial_x[x^n] = n x^{n-1}$ of calculus can be interpreted as

$$ \partial_x[x x \cdots x] = (\not{x}\, x \cdots x) + (x \not{x} \cdots x) + \cdots + (x x \cdots \not{x}), $$

meaning: “pick up in all possible ways a single occurrence of the formal variable and delete it”.
Similarly, $x\partial_x$ means: “pick up an occurrence without deleting it” (this is the pointing operation
of Subsection I.6.2, p. 86).
Guided by this principle, we associate to an urn the linear partial differential operator

$$ \mathcal{D} = x^{\alpha+1} y^{\beta}\, \partial_x + x^{\gamma} y^{\delta+1}\, \partial_y. \tag{140} $$

If $m = x^a y^b$ represents an urn with composition $(a, b)$, then it is easily verified that $\mathcal{D}[m]$
generates all the possible evolutions of the urn in one step; similarly $\mathcal{D}^n[m]$ is the generating
polynomial of the urn's composition after $n$ steps. This gives us a symbolic form of the
exponential MGF $H$ as

$$ H(x, y, z) = \sum_{n \ge 0} \mathcal{D}^n\!\left[x^{a_0} y^{b_0}\right] \frac{z^n}{n!}. \tag{141} $$


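Now comes the crucial (and easy) observation that for a solution $X(t)$, $Y(t)$ of the associated
differential system (139), one has:

$$ \begin{aligned}
\partial_t\!\left(X^a Y^b\right) &= a X^{a-1} X'\, Y^b + b X^a Y^{b-1} Y' && \text{(by usual differentiation rules)} \\
&= a X^{a+\alpha} Y^{b+\beta} + b X^{a+\gamma} Y^{b+\delta} && \text{(by system } \Sigma \text{ of (139))} \\
&= \mathcal{D}\!\left[x^a y^b\right]\Big|_{x \to X,\; y \to Y} && \text{(by definition of } \mathcal{D}\text{)}.
\end{aligned} $$

Induction then provides

$$ \partial_t^n\!\left(X^a Y^b\right) = \mathcal{D}^n\!\left[x^a y^b\right]\Big|_{x \to X,\; y \to Y}. \tag{142} $$

In other words: the evolution of the urn is mimicked by the effect of standard differentiation
applied to solutions of the associated system.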
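We can now conclude. We have formally, from (141) and the correspondence $\mathcal{D}^n \leftrightarrow \partial_t^n$,

$$ H(X(t), Y(t), z) = \sum_{n \ge 0} \partial_t^n\!\left[X(t)^{a_0} Y(t)^{b_0}\right] \frac{z^n}{n!} = X(t+z)^{a_0}\, Y(t+z)^{b_0} $$

(the last form plainly expresses Taylor's formula). Setting $t = 0$ yields the statement. ∎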
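The operator point of view lends itself to a direct computation. The following sketch (function names `urn_step` and `urn_histories` are illustrative, and the urn is assumed tenable, i.e., no exponent ever becomes negative) applies $x^{\alpha+1}y^{\beta}\partial_x + x^{\gamma}y^{\delta+1}\partial_y$ repeatedly to the monomial $x^{a_0}y^{b_0}$, storing polynomials as dictionaries from exponent pairs to coefficients.

```python
from collections import defaultdict

def urn_step(poly, alpha, beta, gamma, delta):
    """Apply D = x^(alpha+1) y^beta d/dx + x^gamma y^(delta+1) d/dy to a
    polynomial stored as {(a, b): coefficient of x^a y^b}."""
    out = defaultdict(int)
    for (a, b), c in poly.items():
        if a > 0:                      # pick one of the a first-colour balls
            out[(a + alpha, b + beta)] += a * c
        if b > 0:                      # pick one of the b second-colour balls
            out[(a + gamma, b + delta)] += b * c
    return dict(out)

def urn_histories(a0, b0, matrix, n):
    """Generating polynomial D^n[x^a0 y^b0] of the urn compositions after n steps."""
    (alpha, beta), (gamma, delta) = matrix
    poly = {(a0, b0): 1}
    for _ in range(n):
        poly = urn_step(poly, alpha, beta, gamma, delta)
    return poly

# Balanced urn with balance 3, started with one ball of each colour:
# the number of histories of length 4 should be 2 * 5 * 8 * 11.
total = sum(urn_histories(1, 1, ((2, 1), (1, 2)), 4).values())

# Ehrenfest urn started with two balls of the first colour, two steps:
ehrenfest = urn_histories(2, 0, ((-1, 1), (1, -1)), 2)
print(total, ehrenfest)
```

For the Ehrenfest case the result agrees with $2!\,[z^2]\,(x\cosh z + y\sinh z)^2 = 2x^2 + 2y^2$.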


As a simple illustration, the Ehrenfest urn (Notes II.11, p. 118 and V.25, p. 336) whose
matrix is $\begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}$ with balance $\sigma = 0$, only requires solving the associated system

$$ X'(t) = Y(t), \qquad Y'(t) = X(t), \qquad X(0) = x_0, \quad Y(0) = y_0, $$

which provides the explicit form

$$ H(x, y, z) = (x \cosh z + y \sinh z)^{a_0}\, (x \sinh z + y \cosh z)^{b_0}. $$


We only discuss one more example, which is typical of the algebraic solution methods and
the corresponding singularity analysis. Consider the urn with matrix $\begin{pmatrix} -1 & 2 \\ 2 & -1 \end{pmatrix}$, which describes
the parity of levels in binary increasing trees [130]. Say we start the urn with one ball of the
first colour and seek the probability that, at time $n$, all balls are of the second colour. We thus
need $[z^n]H(0, 1, z)$. The associated system is

$$ X' = Y^2, \qquad Y' = X^2, \qquad X(0) = 0, \quad Y(0) = 1. $$
The system can be solved by a sequence of manipulations (this is general [225]): starting with

$$ X'' = 2YY' = 2\sqrt{X'}\, X^2, \qquad \text{implying} \qquad X'' \sqrt{X'} = 2 X' X^2, $$

we can integrate the last form, so that

$$ X' = (X^3 + 1)^{2/3}, \qquad \text{i.e.,} \qquad \int_0^{X} \frac{d\zeta}{(1 + \zeta^3)^{2/3}} = t, $$

meaning that $X(t)$ is implicitly determined as the inverse of the integral of an algebraic function.
In this case, it could be verified that the function $X(t)$ is an elliptic function (see [225, 471] for
other elliptic models), but its dominant singularity can be directly determined by the methods
of Example VII.24. The function $X(t)$ is found to become infinite at

$$ \rho = \int_0^{\infty} \frac{d\zeta}{(1 + \zeta^3)^{2/3}} = \frac{\sqrt{3}}{6\pi}\, \Gamma\!\left(\tfrac{1}{3}\right)^{3} $$


by an argument similar to (133), p. 527. A local analysis of the integral combined with inversion
then reveals that $X(t)$ has a simple pole at $\rho$. In addition, we have elementarily $X(\omega t) =
\omega X(t)$ for $\omega^3 = 1$, which entails the existence of three conjugate singularities at $\rho$, $\rho e^{2i\pi/3}$,
and $\rho e^{-2i\pi/3}$. With the initial conditions $(a_0, b_0) = (1, 0)$, the probability that all balls be
of the second colour at time $n$ is then non-zero only if $n \equiv 1 \pmod 3$ and it is found to be
exponentially small: for some computable $c > 0$, there holds

$$ [z^n]X(z) \sim c\,\rho^{-n}, \qquad n \equiv 1 \pmod 3. $$
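A small experiment makes the period-3 pattern and the exponential rate visible. This is a sketch under stated assumptions: the urn matrix is taken to be $\begin{pmatrix} -1 & 2 \\ 2 & -1 \end{pmatrix}$ (as read off from the system $X' = Y^2$, $Y' = X^2$), the helper name is hypothetical, and $\rho$ is evaluated through the Beta-integral form $\Gamma(1/3)^2/(3\Gamma(2/3))$.

```python
from fractions import Fraction
from math import factorial, gamma

def all_second_colour_probability(n):
    """P(all balls have the second colour at time n) for the urn with matrix
    [[-1, 2], [2, -1]] started with a single ball of the first colour."""
    poly = {(1, 0): 1}                      # (a, b) -> number of histories
    for _ in range(n):
        nxt = {}
        for (a, b), c in poly.items():
            if a:                           # first colour drawn: (a, b) -> (a-1, b+2)
                nxt[(a - 1, b + 2)] = nxt.get((a - 1, b + 2), 0) + a * c
            if b:                           # second colour drawn: (a, b) -> (a+2, b-1)
                nxt[(a + 2, b - 1)] = nxt.get((a + 2, b - 1), 0) + b * c
        poly = nxt
    # The urn holds n + 1 balls at time n, so there are n! histories in total.
    return Fraction(poly.get((0, n + 1), 0), factorial(n))

rho = gamma(1 / 3) ** 2 / (3 * gamma(2 / 3))   # int_0^oo (1 + t^3)^(-2/3) dt
probs = [all_second_colour_probability(n) for n in range(32)]
growth = float(probs[28] / probs[31])          # should approach rho^3
print(probs[1], probs[4], probs[7], growth, rho ** 3)
```

The first non-zero values are $1$, $1/6$, $2/63$, in agreement with the expansion $X(z) = z + z^4/6 + 2z^7/63 + \cdots$ obtained from $X' = (1 + X^3)^{2/3}$.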


In [225, 229] it is shown that one can develop along these lines a complete treatment of
2 × 2 balanced urns and fully characterize the limit distributions involved.


▷ VII.52. Diagrams and combinatorial modelling via differential operators. Define the linear
differential operator

$$ \mathcal{D} := x\,\partial_x^2. $$

Its meaning, when applied to a monomial $x^n$, is to pick up two occurrences of $x$, replace them
by unity, and then create a new occurrence of $x$ (this is analogous to a one-colour urn model). It
can thus be represented by a “gate” with two “inputs” and one “output”. The effect of applying
$\mathcal{D}^n$ to $x^{n+1}$ is then to build all the binary trees, whose external nodes are the occurrences of
the original $x$-variables and whose internal nodes (the gates) are characterized by their order of
arrival. Indeed, each particular expansion results in a binary decreasing tree (node labels are
decreasing from the root; such a tree is clearly isomorphic to an increasing binary tree) with
distinguished external nodes as in the following example relative to $n = 4$,


[Figure: a binary decreasing tree with gates labelled 1 to 4 and external nodes $x_5, x_1, x_3, x_2, x_4$.]

(In this particular expansion, the first application of $\mathcal{D}$ is to the first ($x_1$) and
third ($x_3$) occurrence of $x$ in $xxxxx$, corresponding to the first gate (labelled 1), and it
creates one new occurrence of $x$ (the output link of gate 1). The second application is to
$x_2$, $x_4$ (gate 2). The third application is to $x_5$ and to the $x$ produced by gate 1; and so on.)

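Consequently:

$$ \mathcal{D}^n\!\left[x^{n+1}\right] = n!\,(n+1)!\;x, \qquad \text{equivalently,} \qquad \frac{1}{n!}\,[x]\,\mathcal{D}^n\!\left[\frac{x^{n+1}}{(n+1)!}\right] = 1. $$

Thus, one obtains the EGF of decreasing trees, i.e., permutations, via the coefficient of $x$ in

$$ e^{z\mathcal{D}}\!\left[e^x\right] = 1 + \frac{x}{1-z} + \frac{x^2}{2!\,(1-z)^2} + \cdots. $$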
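Other operators that may be considered include

$$ \mathcal{D} = x + \partial_x, \quad x\,\partial_x, \quad x^2 + \partial_x^2, \quad x\,\partial_x^3, \quad x\,\partial_x^2 + x\,\partial_x, \ \ldots. $$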


It is fascinating to try and model as many classical combinatorial structures as possible in this
way, via differential operators and systems of gates. (This exercise was suggested by works
of Błasiak, Horzela, Penson, Duchamp, and Solomon [73, 74], themselves motivated by the
“boson normal ordering problem” of quantum physics.) ◁

To conclude this section, it is of interest to compare the properties of increasing
trees (Example VII.24) and of simple varieties of trees (Subsection VII.3.2, p. 455).
The conclusion is that simple trees are of the “square-root” type, in the sense that the
typical depth of a node and the expected height are of order $\sqrt{n}$. By contrast, increasing
trees, which are strongly bound by an order constraint, have logarithmic depth and
height [157, 158, 160]; they belong to a “logarithmic” type. From a singular
perspective, simple trees are associated to the universal $Z^{1/2}$ law, while increasing trees
exhibit a divergence behaviour ($Z^{-1/(d-1)}$ in the polynomial case). Tolls then affect
singularities of GFs in rather different ways: through a factor $Z^{-1/2}$ for simple trees,
through a factor $\log Z$ in the case of increasing trees. Such abstract observations are
typical of the spirit of analytic combinatorics.

A spectacular result in the general area of random discrete structures and nonlinear
differential equations is the discovery by Baik, Deift, and Johansson (Note VIII.46,
p. 598) of the law governing the longest increasing subsequence in a random permutation.
There, the solutions of the nonlinear Painlevé equation $u''(x) = 2u(x)^3 + x\,u(x)$
play a central rôle.


VII. 10. Singularity analysis and probability distributions 


Singularity analysis can often be used to extract information about the probabil- 
ity distribution of a combinatorial parameter. In the central sections of Chapter IX 
(pp. 650-666), we shall develop perturbation methods grafted on singularity analysis, 
which are applicable given a bivariate generating function F(z, u), provided it can be 
continued when u lies in a complex neighbourhood of 1. However, such conditions
are not always satisfied. First, it may be the case that F(z, u) is defined for no other 
value than z = 0 (it diverges), as soon as u > 1. Second, it may be the case that 
a parameter is accessible via a collection of univariate GFs rather than a BGF (see 
typically our discussion of extremal parameters in Section III. 8, p. 214). We briefly 
indicate in this section ways to deal with such situations. 


VII. 10.1. Moment pumping. Our reader should have no difficulty in recogniz- 
ing as familiar at least the first two steps of the following procedure, nicknamed “mo- 
ment pumping” in [249], which serve to extract moments from bivariate generating 
functions. 


Procedure: Moment Pumping 

Input: A bivariate generating function $F(z, u)$ determined by a functional equation.

Output: The limit law corresponding to the array of coefficients $[z^n u^k]F(z, u)$; that is, the
asymptotic probability distribution of a parameter $\chi$ on a class $\mathcal{F}_n$.

Step 1. Elucidate the singular structure of $F(z, 1)$ corresponding to the counting problem
$[z^n]F(z, 1)$. (Tools of Chapters IV–VII are well-suited for this task, the functional equation
satisfied by $F(z, 1)$ being usually simpler than that of $F(z, u)$.)

Step 2. Work out the singular structure (main terms) of each of the partial derivatives

$$ \mu_r(z) := \left.\frac{\partial^r}{\partial u^r} F(z, u)\right|_{u=1} $$

for $r = 1, 2, \ldots$, and use meromorphic methods or singularity analysis to conclude as to
$[z^n]\mu_r(z)$. If, as is most often the case, the combinatorial parameter marked by $u$ is of
polynomial growth in the size $n$, then the radius of convergence of each $\mu_r$ is a priori the same
as that of $F(z, 1)$. Furthermore, in many cases, the singular structure of the $\mu_r(z)$ is of the same
broad type as that of $\mu_0(z) = F(z, 1)$.

Step 3. From the moments, as given by Step 2, attempt to reconstruct the limit distribution 
using the Moment Convergence Theorem (Theorem C.2, p. 778). 


In order for the procedure to succeed²², we typically need the standard deviation of $\chi$
to be of the same order as the mean, which necessitates that the distribution is spread in
the sense of Chapter III, p. 161. (Otherwise, there are larger and larger cancellations
in moments of the centred and scaled variant of $\chi$, so that the analysis requires an
unbounded number of terms in the singular expansions of the GFs $\mu_r(z)$; see also
Pittel's study [484] for an insightful discussion of related problems.)


Example VII.26. The area under Dyck excursions. We now examine the coefficients in the
BGF, which is a solution of the functional equation

$$ F(z, q) = \frac{1}{1 - zF(qz, q)}, \qquad \text{i.e.,} \qquad F(z, q) = 1 + zF(z, q)\,F(qz, q). \tag{143} $$

It is such that $[z^n q^k]F(z, q)$ represents the number of Dyck excursions of length $2n$ and area $k - n$
(p. 330). Thus we are aiming at characterizing the distribution of area in Dyck paths. We set

$$ \mu_r(z) := \left.\partial_q^r F(z, q)\right|_{q=1}, $$

which is, up to normalization, the GF of the $r$th factorial moments.
Clearly, $\mu_0$ satisfies the relation $\mu_0 = 1 + z\mu_0^2$, and $\mu_0 = \frac{1}{2z}\left(1 - \sqrt{1 - 4z}\right)$, as anticipated.
Application of the moment pumping procedure leads to a collection of equations,

$$ \begin{aligned}
\mu_1 &= 2z\mu_0\mu_1 + z^2\mu_0\mu_0', \\
\mu_2 &= 2z\mu_0\mu_2 + 2z\mu_1^2 + 2z^2\mu_1\mu_0' + 2z^2\mu_0\mu_1' + z^3\mu_0\mu_0'',
\end{aligned} $$

and so on. Precisely, the shape of the equation giving $\mu_r$, for $r \ge 1$, is

$$ \mu_r = z \sum_{j=0}^{r} \binom{r}{j} \mu_{r-j} \sum_{k=0}^{j} \binom{j}{k} z^k\, \partial_z^k \mu_{j-k}, \tag{144} $$

as results, upon setting $q = 1$, from Leibniz's product rule and a computation of the derivatives
$\partial_q^j F(qz, q)$. In particular, each $\mu_r$ can be expressed from the previous ones and their derivatives,
since the equation relative to $\mu_r$ is of the linear form $\mu_r = 2z\mu_0\mu_r + \cdots$, so that $\mu_r(z)$ is a
rational form in $z$ and $\delta := \sqrt{1 - 4z}$. An examination of the initial values of the $\mu_r$ then suggests
that, in terms of dominant singular asymptotics, as $z \to \frac{1}{4}$ there holds

$$ \mu_r(z) = \frac{\Lambda_r}{(1 - 4z)^{(3r-1)/2}} + O\!\left((1 - 4z)^{-(3r-2)/2}\right), \qquad r \ge 1, \tag{145} $$

a property that is readily verified by induction. (In such situations, the closure of functions of
singularity analysis class under differentiation, p. 419, proves handy.) In particular, by
singularity analysis, the mean and standard deviation of $\chi$ on $\mathcal{F}_n$ are each of order $n^{3/2}$.

22 The important Gaussian case, which is mostly excluded by moment pumping, tends to yield
agreeably to the perturbation methods of Chapter IX, so that the univariate methods discussed here
and those of Chapter IX are indeed complementary.

Now, equipped with (145), we can trace back the main singular contributions in (144),
noting that the “weight”, as measured by the exponent of $(1 - 4z)^{-1}$, of the term in (144)
corresponding to generic indices $j, k$ is $(3r - k - 2)/2$. Then, by identifying the corresponding
coefficients, we come up with the recurrence valid for $r \ge 2$

$$ \Lambda_r = \frac{1}{4} \sum_{j=1}^{r-1} \binom{r}{j} \Lambda_j \Lambda_{r-j} + \frac{r(3r-4)}{4}\, \Lambda_{r-1} \tag{146} $$

(the linear term arises from $j = r$, $k = 1$) and from (145) and (146), the shape of factorial
moments, hence that of the usual power moments, results by plain singularity analysis:

$$ \mathbb{E}_n(\chi^r) \sim M_r\, n^{3r/2}, \qquad M_r := \frac{\sqrt{\pi}\,\Lambda_r}{\Gamma((3r-1)/2)}. \tag{147} $$

It can then be verified [568] that the moments $M_r$ uniquely characterize a probability distribution
(Appendix C.5: Convergence in law, p. 776).
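The recurrence on the leading constants is easy to iterate. The sketch below rests on assumptions that should be checked against the source: the constants are written $\Lambda_r$, the recurrence is taken as $\Lambda_r = \frac14\sum_{j=1}^{r-1}\binom{r}{j}\Lambda_j\Lambda_{r-j} + \frac{r(3r-4)}{4}\Lambda_{r-1}$, the initial value $\Lambda_1 = 1/2$ is derived here from $\mu_1 = z^2\mu_0\mu_0'/(1 - 2z\mu_0)$, and the moments are normalized as $M_r = \sqrt{\pi}\,\Lambda_r/\Gamma((3r-1)/2)$. As a sanity check, the ratio $M_2/M_1^2$ should equal $10/(3\pi)$, the classical value for Brownian excursion area, which does not depend on the normalization of area.

```python
from fractions import Fraction
from math import comb, gamma, pi, sqrt

def airy_constants(R):
    """Constants Lambda_1..Lambda_R (index 0 unused), assuming Lambda_1 = 1/2."""
    lam = [None, Fraction(1, 2)]
    for r in range(2, R + 1):
        quad = sum(comb(r, j) * lam[j] * lam[r - j] for j in range(1, r))
        lam.append(Fraction(1, 4) * quad + Fraction(r * (3 * r - 4), 4) * lam[r - 1])
    return lam

def moment(r, lam):
    """Assumed limiting moment M_r = sqrt(pi) * Lambda_r / Gamma((3r - 1)/2)."""
    return sqrt(pi) * float(lam[r]) / gamma((3 * r - 1) / 2)

lam = airy_constants(6)
m1, m2 = moment(1, lam), moment(2, lam)
print(lam[1:4], m1, m2, m2 / m1 ** 2)
```

With these conventions $M_1 = \sqrt{\pi}/2$ and $M_2 = 5/6$.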


Proposition VII.15. The distribution of area $\chi$ in Dyck excursions, scaled by $n^{-3/2}$,
converges to a limit, known as the Airy²³ distribution of the area type, which is determined by its
moments $M_r$, as specified by (146) and (147). In other terms, there exists a distribution function
$H(x)$ supported by $\mathbb{R}_{\ge 0}$ such that $\lim_{n \to \infty} \mathbb{P}_n(\chi < x n^{3/2}) = H(x)$.


Due to the exact correspondence between Dyck excursions and trees, the same limit
distribution occurs for path length in general Catalan trees. Proposition VII.15 is originally due to
Louchard [415, 416], who developed connections with Brownian motion; the limit distribution
is indeed up to normalization that of Brownian excursion area. (The approach presented here
also has the merit of providing finite $n$ corrections.) Our moment pumping approach largely
follows the lines of Takács' treatment [568]. The recurrence relation (144) can furthermore be
solved by generating functions, to the effect that the $\Lambda_j$ entertain intimate relations with the
Airy function: for surveys, see [244, 352]. Curiously, the Wright constants arising in the
enumeration of labelled graphs of fixed excess (the $P_k(1)$ of p. 134) appear to be closely related to
the moments $M_r$: this fact can be explained combinatorially by means of breadth-first search of
graphs, as noted by Spencer [548].


▷ VII.53. Path length in simple varieties of trees. Under the usual conditions on $\phi$, the limit
distribution is an Airy distribution of the area type, as shown by Takács [566]. ◁


▷ VII.54. A parking problem II. This continues Example II.19, p. 146. Consider $m$ cars and
condition by the fact that everybody eventually finds a parking space and the last space remains
empty. Define total displacement as the sum of the distances (over all cars) between the initially
intended parking location and the first available space. The analysis reduces to the
difference-differential equation [249, 380], which generalizes (65), p. 146,

$$ \frac{\partial}{\partial z} F(z, q) = F(z, q) \cdot \frac{F(z, q) - qF(qz, q)}{1 - q}. $$

Moment pumping is applicable [249]: the limit distribution is once more an Airy (of area type).
This problem arises in the analysis of the linear probing hashing algorithm [380, §6.4] and is of
relevance as a discrete version of important coalescence models. It is also shown in [249] based
on [285] that the number of inversions in a Cayley tree is asymptotically Airy.


23 The Airy function $\mathrm{Ai}(z)$ is of hypergeometric type and is closely related to Bessel functions of
order $\pm 1/3$. It is defined as the solution of $y'' - zy = 0$ satisfying $\mathrm{Ai}(0) = 3^{-2/3}/\Gamma(2/3)$ and
$\mathrm{Ai}'(0) = -3^{-1/3}/\Gamma(1/3)$; see [3, 604] for basic properties. The $\Lambda_j$ intervene in the expansion of
$\log \mathrm{Ai}(z)$ at infinity [244, 352]. After Louchard and Takács, the distribution function $H(x)$ can be
expressed in terms of confluent hypergeometric functions and zeros of the Airy function.

▷ VII.55. The Wiener index and other functionals of trees. The Wiener index, a structural index
of interest to chemists, is defined as the sum of the distances between all pairs of nodes in a tree.
For simple families, as shown by Janson [348], it admits a limit distribution. (Similar properties
hold for many additive functionals of combinatorial tree families [210]. As regards moment
pumping, the methods are also related to those of Subsection VI.10.3, p. 427, dedicated to tree
recurrences.) ◁


▷ VII.56. Difference equations, polyominoes, and limit laws. Many of the $q$-difference
equations that are defined by a polynomial relation between $F(z, q)$, $F(qz, q)$, ... (and even
systems) may be analysed, as shown by Richard [509, 510]. This covers several models of
polyominoes, including the staircase, the horizontally-vertically convex, and the column convex ones.
Area (for fixed perimeter) is asymptotically Airy distributed. It is from these and similar results,
supplemented by extensive computations based on transfer-matrix methods, that Guttmann and
the Melbourne school have been led to conjecturing that the limit area of self-avoiding polygons
(closed walks) in the plane is Airy (see our comments on p. 365).


▷ VII.57. Path length in increasing trees. For binary increasing trees, the analysis of path
length reduces to that of the functional equation,

$$ F(z, q) = 1 + \int_0^z F(qt, q)^2\, dt. $$

There exists a limit law, as first shown by Hennequin [328] using moment pumping, with
alternative approaches due to Régnier [505] and Rösler [517]. This law is important in computer
science, since it describes the number of comparisons used by the Quicksort algorithm and
involved in the construction of a binary search tree. The mean is $2n \log n + O(n)$, the variance
is $\sim (7 - 4\zeta(2))\,n^2$, and the moment of order $r$ of the limit law is a polynomial form in zeta
values $\zeta(2), \ldots, \zeta(r)$. See [209] for recent news and references. ◁


VII. 10.2. Families of generating functions. There is no logical obstacle to ap- 
plying singularity analysis to a whole family of functions. In a way, this is similar to 
what was done in Chapter V when analysing longest runs in words (p. 308) and the 
height of general Catalan trees (p. 326), in the simpler case of meromorphic coeffi- 
cient asymptotics. One then needs to develop suitable singular expansions together 
with companion error terms, a task that may be technically demanding when GFs are 
given by nonlinear functional relations or recurrences. We illustrate below the situation
by an aperçu of the analysis of height in simple varieties of trees.


Example VII.27. Height in simple varieties of trees. The recurrence

$$ y_0(z) = 0, \qquad y_{h+1}(z) = 1 + z\,y_h(z)^2 \tag{148} $$

is such that $y_h(z)$ is the OGF of binary trees of height less than $h$, with size measured by the
number of binary nodes (Example III.28, p. 216). Each $y_h(z)$ is a polynomial, with $\deg(y_h) =
2^{h-1} - 1$. Some technical difficulties are to be expected since the $y_h$ have no singularity at a
finite distance, whereas their formal limit $y(z)$ is the OGF of Catalan numbers,

$$ y(z) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z}\right), $$

which has a square-root singularity at $z = 1/4$. As a matter of fact, the sequence $w_h = z y_h$
satisfies the recurrence $w_{h+1} = z + w_h^2$, which was made famous by Mandelbrot's studies and
gives rise to amazing graphics [473]; see Figure VII.23 for a poor man's version.

When $|z| < r < 1/4$, simple majorant series considerations show that the convergence
$y_h(z) \to y(z)$ is uniformly geometric. When $z \ge s > 1/4$, it can be checked that the $y_h(z)$
grow doubly exponentially. What happens in-between, in a $\Delta$-domain, needs to be quantified.
We do so following Flajolet, Gao, Odlyzko, and Richmond [230, 246].

[Figure VII.23: image not reproduced; both axes run from 0.0 to 1.0.] The grey level relative
to a point $z = x + iy$ in the diagram indicates the number of iterations necessary for the GFs
$y_h(z)$ either to diverge to infinity (the outer, darker region) or to converge to the finite limit
$y(z)$ (the inner region, corresponding to the Mandelbrot set, with the darker area around 0
corresponding to faster convergence). The cardioid-shaped region defined by $|1 - \varepsilon(z)| \le 1$ is a
guaranteed region of convergence, beyond the circle $|z| = 1/4$. The determination of height
reduces to finding what goes on near the cusp $z = 1/4$ of the cardioid.

Figure VII.23. The GFs of binary trees of bounded height: speed of convergence.


Starting from the basic recurrence (148), we have

$$ y - y_{h+1} = z\left(y^2 - y_h^2\right) = z\,(y - y_h)\left(2y - (y - y_h)\right), $$

which rewrites as

$$ e_{h+1} = (2zy)\, e_h (1 - e_h), \qquad \text{where} \quad e_h(z) = \frac{1}{2y(z)}\left(y(z) - y_h(z)\right) \tag{149} $$

is proportional to the OGF of trees having height at least $h$. (The function $x \mapsto \lambda x(1 - x)$,
which is at the basis of the recurrence (149), is also known as the logistic map; its iterates, for
real parameter values $\lambda$, give rise to a rich diversity of patterns.)

First, let us examine what happens right at the singularity $1/4$ and consider $e_h \equiv e_h(\tfrac{1}{4})$.
The induced recurrence is

$$ e_{h+1} = e_h (1 - e_h), \qquad \text{with } e_0 = \tfrac{1}{2}, \tag{150} $$

whose solution decreases monotonically to 0 (argument: otherwise, there would need to be a
fixed point in $(0, 1)$). This form resembles the familiar recurrence associated with the solution
by iteration of a fixed-point equation $\xi = f(\xi)$, but here it corresponds to an “indifferent”
fixed-point, $f'(\xi) = 1$, which precludes the usual geometric convergence. A classical trick of
iteration theory, found in de Bruijn's book [143, §8.4], neatly solves the problem. Consider
instead the quantities $f_h := 1/e_h$, which satisfy the induced recurrence

$$ f_{h+1} = \frac{f_h}{1 - f_h^{-1}} \equiv f_h + 1 + \frac{1}{f_h} + \frac{1}{f_h^2} + \cdots, \qquad \text{with } f_0 = 2. \tag{151} $$

This suggests that $f_h \sim h$. Indeed, by a terminating form of (151),

$$ f_{h+1} = f_h + 1 + \frac{1}{f_h} + \frac{1}{f_h^2}\,\frac{1}{1 - f_h^{-1}}, \qquad \text{i.e.,} \qquad f_h = h + 2 + \sum_{j=0}^{h-1} \frac{1}{f_j} + \sum_{j=0}^{h-1} \frac{1}{f_j^2}\,\frac{1}{1 - f_j^{-1}}, \tag{152} $$

one can derive properties of the sequence $(f_h)$ by “bootstrapping”: the fact that $f_h > h$ implies
that the first sum in (152) is $O(\log h)$, while the second one is $O(1)$; then, another round serves
to refine the estimates, so that, for some $C$:

$$ f_h = h + \log h + C + O\!\left(\frac{\log h}{h}\right), $$

and the behaviour of $e_h = 1/f_h$ is now well quantified.
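The bootstrapped estimate is easy to observe numerically. The sketch below (the function name `f_sequence` is illustrative) iterates $e_{h+1} = e_h(1 - e_h)$ from $e_0 = 1/2$ in floating point and tabulates $f_h - h - \log h$, which should stabilize around the constant $C$ with a drift of order $(\log h)/h$.

```python
from math import log

def f_sequence(H):
    """Return [f_0, ..., f_H] where f_h = 1/e_h and e_{h+1} = e_h (1 - e_h), e_0 = 1/2."""
    e = 0.5
    out = [1.0 / e]
    for _ in range(H):
        e = e * (1.0 - e)
        out.append(1.0 / e)
    return out

f = f_sequence(10 ** 5)
# Estimates of the constant C at two widely separated indices.
c_small = f[10 ** 4] - 10 ** 4 - log(10 ** 4)
c_large = f[10 ** 5] - 10 ** 5 - log(10 ** 5)
print(c_small, c_large)
```

The two estimates of $C$ agree to within about $10^{-3}$, consistent with the $O((\log h)/h)$ error term.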
The analysis for $z \ne 1/4$ proceeds along similar lines. We set $\varepsilon \equiv \varepsilon(z) := \sqrt{1 - 4z}$ and
again abbreviate $e_h(z)$ as $e_h$. Upon considering

$$ f_h = \frac{(1 - \varepsilon)^h}{e_h} $$

and taking inverses, we obtain

$$ f_{h+1} = f_h + (1 - \varepsilon)^h + \frac{e_h (1 - \varepsilon)^h}{1 - e_h}. \tag{153} $$

Proceeding as before leads to the general approximation

$$ e_h(z) \sim \frac{\varepsilon(z)\,(1 - \varepsilon(z))^h}{1 - (1 - \varepsilon(z))^h}, \tag{154} $$

proved to be valid for any fixed $z \in (0, 1/4)$, as $h \to \infty$. This approximation is compatible
both with $e_h(1/4) \sim 1/h$ (derived earlier) and with the geometric convergence of $y_h(z)$ to $y(z)$
valid for $0 < z < 1/4$. With some additional work, it can be proved that (154) remains valid as
$z \to \frac{1}{4}$ in a $\Delta$-domain and as $h \to \infty$; see Figure VII.23. Obtaining the detailed conditions
on $(z, h)$, together with a uniform error term for (154), is the crux of the analysis in [247].

From this point on, we content ourselves with brief indications on subsequent developments.
Given (154), one deduces²⁴ that the GF of cumulated height satisfies

$$ H(z) = 2y(z) \sum_{h \ge 0} e_h(z) \sim 4 \sum_{h \ge 1} \frac{\varepsilon\,(1 - \varepsilon)^h}{1 - (1 - \varepsilon)^h}, $$

as $z \to \frac{1}{4}$. Thus, by singularity analysis, one has

$$ H(z) \sim 2 \log \frac{1}{1 - 4z} \qquad \Longrightarrow \qquad [z^n]H(z) \sim 2 \cdot 4^n / n, $$

which gives the expected height $[z^n]H(z)/[z^n]y(z)$ of a binary tree of size $n$ as $\sim 2\sqrt{\pi n}$.
Moments of higher order can be similarly analysed.
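For moderate sizes the expected height can be computed exactly from the polynomials $y_h$, which gives a feel for how slowly the asymptotic regime sets in. The sketch below is an illustration only (the function name `mean_height` is hypothetical): it truncates each $y_h$ after $z^n$, counts the trees of size $n$ and height at least $h$ as $[z^n](y - y_h)$, and sums over $h \ge 1$.

```python
from fractions import Fraction
from math import comb, pi, sqrt

def mean_height(n):
    """Exact mean height of binary trees with n >= 1 binary nodes (uniform model)."""
    catalan = comb(2 * n, n) // (n + 1)      # [z^n] y(z)
    y = [0] * (n + 1)                        # y_0(z) = 0, truncated after z^n
    cumulated = 0
    for h in range(1, n + 2):
        sq = [0] * (n + 1)                   # y_{h-1}^2, only degrees 0..n-1 are needed
        for i, a in enumerate(y):
            if a:
                for j in range(n - i):
                    sq[i + j] += a * y[j]
        y = [1] + sq[:n]                     # y_h = 1 + z * y_{h-1}^2
        cumulated += catalan - y[n]          # number of trees of height >= h
    return Fraction(cumulated, catalan)

n = 150
ratio = float(mean_height(n)) / (2 * sqrt(pi * n))
print(float(mean_height(3)), float(mean_height(4)), ratio)
```

At sizes of this order the exact mean is still visibly below $2\sqrt{\pi n}$: the ratio is less than 1 and approaches it only slowly, so the estimate is best read as a first-order statement.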

It is of interest to note that the GFs that surface explicitly in the analysis of height in 
general Catalan trees (eventually due to the continued fraction structure and the implied linear 
recurrences) appear here as analytic approximations in suitable regions of the complex plane. 
A precise form of the approximation (154) can also be subjected to singularity analysis, to the 
effect that the same Theta law expresses in the asymptotic limit the distribution of height in 
binary trees. Finally, the technique can be extended to all simple varieties of trees satisfying the 
smooth inverse-function schema (Theorem VII.2, p. 453). In summary, we have the following 
proposition [230, 246]. 

Proposition VII.16. Let $\mathcal{Y}$ be a simple variety of trees satisfying the conditions of
Theorem VII.2, with $\phi$ the basic tree constructor and $\tau$ the root of the characteristic equation
$\phi(\tau) - \tau\phi'(\tau) = 0$. Let $\chi$ denote tree height. Then the $r$th moment of height satisfies

$$ \mathbb{E}_{\mathcal{Y}_n}[\chi^r] \sim r(r-1)\,\Gamma(r/2)\,\zeta(r)\,\xi^{r/2}\, n^{r/2}, \qquad \xi := \frac{2\phi'(\tau)^2}{\phi(\tau)\,\phi''(\tau)}. $$

The normalized height $\chi/\sqrt{\xi n}$ converges to a Theta law, both in distribution and in the sense
of a local limit law.

24 In order to obtain the logarithmic approximation of $H(z)$, one can for instance appeal to Mellin
transform techniques in a way parallel to the analysis of general Catalan trees (p. 326): set $1 - \varepsilon(z) = e^{-t}$.

(The Theta distribution is defined in (67), p. 328; Chapter IX develops the notions of
convergence in law and of local limits much further.) In particular the expected height in general
Catalan trees [145], binary trees, unary–binary trees, pruned $t$-ary trees, and Cayley trees [507],
is found to be, respectively, asymptotic to

$$ \sqrt{\pi n}, \qquad 2\sqrt{\pi n}, \qquad \sqrt{3\pi n}, \qquad \sqrt{2\pi t n/(t-1)}, \qquad \sqrt{2\pi n}, $$

and a pleasant universality phenomenon manifests itself in the height of simple trees.
A somewhat related analysis of a polynomial iteration in the vicinity of a singularity yields
the asymptotic number of balanced trees (Note IV.49, p. 283).
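As a quick consistency check, two of the listed constants can be recovered by hand. The computation below is a sketch that reads the constant of Proposition VII.16 as $\xi = 2\phi'(\tau)^2/(\phi(\tau)\phi''(\tau))$ and uses the first moment in the form $\sqrt{\pi \xi n}$ (the value of $r(r-1)\Gamma(r/2)\zeta(r)\xi^{r/2}n^{r/2}$ as $r \to 1$, since $(r-1)\zeta(r) \to 1$ and $\Gamma(1/2) = \sqrt{\pi}$).

```latex
\begin{align*}
\text{Binary trees: } & \phi(w) = (1+w)^2, \quad \phi(\tau) = \tau\phi'(\tau) \;\Rightarrow\; \tau = 1, \\
& \xi = \frac{2 \cdot 4^2}{4 \cdot 2} = 4, \qquad \sqrt{\pi \xi n} = 2\sqrt{\pi n}; \\[1ex]
\text{Cayley trees: } & \phi(w) = e^{w}, \quad \tau = 1, \\
& \xi = \frac{2 e^2}{e \cdot e} = 2, \qquad \sqrt{\pi \xi n} = \sqrt{2\pi n}.
\end{align*}
```

Both values agree with the list of expected heights given above.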


VII. 11. Perspective 


The theorems in this chapter demonstrate the central rôle of the singularity analysis
theory developed in Chapter VI, this in a way that parallels what Chapter V did
for Chapter IV with meromorphic function analysis. Exploiting properties of complex 
functions to develop coefficient asymptotics for abstract schemas helps us solve whole 
collections of combinatorial constructions at once. 

Within the context of analytic combinatorics, the results in this chapter have broad 
reach, and bring us closer to our ideal of a theory covering full analysis of combi- 
natorial objects of any “reasonable” description. Analytic side conditions defining 
schemas often play a significant rôle. Adding in this chapter the mathematical support
for handling set constructions (with the exp—log schema) and context-free construc- 
tions (with coefficient asymptotics of algebraic functions) to the support developed 
in Chapter V to handle the sequence construction (with the supercritical sequence 
schema) and regular constructions (with coefficient asymptotics of rational functions) 
gives us general methods encompassing a broad swathe of combinatorial analysis, 
with a great many applications (Figure VII.24). 

Together, the methods covered in Chapter V, this chapter, and, next, Chapter VIII 
(relative to the saddle-point method) apply to virtually all of the generating functions 
derived in Part A of this book by means of the symbolic techniques defined there. 
The SEQ construction and regular specifications lead to poles; the SET construction 
leads to algebraic singularities (in the case of logarithmic generators discussed here) or 
to essential singularities (in most of the remaining cases discussed in Chapter VIII); 
recursive (context-free) constructions lead to square-root singularities. The surpris- 
ing end result is that the asymptotic counting sequences from all of these generating 
functions have one of just a few functional forms. This universality means that com- 
parisons of methods, finding optimal values of parameters, and many other outgrowths 
of analysis can be very effective in practical situations. Indeed, because of the nature 
of the asymptotic forms, the results are often extremely accurate, as we have seen 
repeatedly in this book. 

The general theory of coefficient asymptotics based on singularities has many ap- 
plications outside of analytic combinatorics (see the notes below). The broad reach of 
the theory provides strong indications that universal laws hold for many combinatorial 
constructions and schemas yet to be discovered. 
Combinatorial type          Coeff. asymptotics (subexp. term)   Reference

Rooted maps                 $n^{-5/2}$                          §VII.8.2
Unrooted trees              $n^{-5/2}$                          §VII.5
Rooted trees                $n^{-3/2}$                          §VII.3, §VII.4
Excursions                  $n^{-3/2}$                          §VII.8.1
Bridges                     $n^{-1/2}$                          §VII.8.1
Mappings                    $n^{-1}$                            §VII.3.3
Exp–log sets                $n^{\kappa-1}$                      §VII.2
Increasing $d$-ary trees    $n^{-(d-2)/(d-1)}$                  §VII.9.2


Analytic form                        Singularity type              Coeff. asymptotics                          Reference

Positive irred. (polynomial syst.)   $Z^{1/2}$                     $\zeta^{-n} n^{-3/2}$                       §VII.6
General algebraic                    $Z^{p/q}$                     $\zeta^{-n} n^{-p/q-1}$                     §VII.7
Regular singularity (ODE)            $Z^{\theta}(\log Z)^{\ell}$   $\zeta^{-n} n^{-\theta-1}(\log n)^{\ell}$   §VII.9.1


Figure VII.24. A collection of universality laws summarized by the subexponential 
factors involved in the asymptotics of counting sequences (top). A summary of the 
main singularity types and asymptotic coefficient forms of this chapter (bottom). 
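
As a small numerical illustration of the $n^{-3/2}$ law listed for rooted trees, the sketch below uses the Catalan numbers (an assumption made here for concreteness: they count plane binary trees, one simple variety of trees) and checks that the counting sequence divided by $4^n n^{-3/2}$ settles near the constant $1/\sqrt{\pi}$. The function names are illustrative only.

```python
from math import comb, pi, sqrt

def catalan(n: int) -> int:
    """Exact Catalan number C_n = binomial(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def normalized(n: int) -> float:
    """C_n divided by the predicted form 4^n * n^(-3/2)."""
    # The big-integer quotient is formed first, so no float overflow occurs.
    return catalan(n) / 4 ** n * n ** 1.5

for n in (10, 100, 1000):
    # The second column should approach the third as n grows.
    print(n, normalized(n), 1 / sqrt(pi))
```

The exponential factor $4^n$ and the constant are specific to this family; only the subexponential factor $n^{-3/2}$ is the universal part.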


Bibliographic notes. The exp–log schema, like its companion, the supercritical-sequence
schema, illustrates the level of generality that can be attained by singularity analysis techniques. 
Refinements of the results we have given can be found in the book by Arratia, Barbour, and 
Tavaré [20], which develops a stochastic process approach to these questions; see also [19] by 
the same authors for an accessible introduction. 

The rest of the chapter deals in an essential manner with recursively defined structures. As 
noted repeatedly in the course of this chapter, recursion is conducive to square-root singularity 
and universal behaviours of the form $n^{-3/2}$. Simple varieties of trees have been introduced
in an important paper of Meir and Moon [435], that bases itself on methods developed earlier
by Pólya [488, 491] and Otter [466]. One of the merits of [435] is to demonstrate that a high
level of generality is attainable when discussing properties of trees. A similar treatment can be 
inflicted more generally to recursively defined structures when their generating functions satisfy 
an implicit equation. In this way, non-plane unlabelled trees are shown to exhibit properties 
very similar to their plane counterparts. It is of interest to note that some of the enumerative 
questions in this area had been initially motivated by problems of theoretical chemistry: see the 
colourful account of Cayley and Sylvester’s works in [67], the reference books by Harary and 
Palmer [319] and Finch [211], as well as Pólya’s original studies [488, 491].


Algebraic functions are the modern counterpart of the study of curves by classical Greek 
mathematicians. They are either approached by algebraic methods (this is the core of algebraic 
geometry) or by transcendental methods. For our purposes, however, only rudiments of the 
theory of curves are needed. For this, there exist several excellent introductory books, of which 
we recommend the ones by Abhyankar [2], Fulton [273], and Kirwan [365]. On the algebraic
side, we have aimed at providing an introduction to algebraic functions that requires minimal 
apparatus. At the same time the emphasis has been put somewhat on algorithmic aspects, since 
most algebraic models are nowadays likely to be treated with the help of computer algebra. 
As regards symbolic computational aspects, we recommend the treatise by von zur Gathen and 
Gerhard [599] for background, while polynomial systems are excellently reviewed in the book 
by Cox, Little, and O’Shea [135]. 

In the combinatorial domain, algebraic functions have been used early: in Euler and Segner’s
enumeration of triangulations (1753) as well as in Schröder’s famous “Vier combinatorische
Probleme” described by Stanley in [554, p. 177]. A major advance was the realization
by Chomsky and Schützenberger that algebraic functions are the “exact” counterpart of context-free
grammars and languages (see their historic paper [119]). A masterful summary of the early
theory appears in the proceedings edited by Berstel [54] while a modern and precise presentation
forms the subject of Chapter 6 of Stanley’s book [554]. On the analytic asymptotic side,
many researchers have long been aware of the power of Puiseux expansions in conjunction with
some version of singularity analysis (often in the form of the Darboux–Pólya method: see [491]
based on Pólya’s classic paper [488] of 1937). However, there appeared to be difficulties in coping
with the fully general problem of algebraic coefficient asymptotics [102, 440]. We believe
that Section VII.7 sketches the first complete theory (though most ingredients are of folklore
knowledge). In the case of positive systems, the “Drmota–Lalley–Woods” theorem is the key to
most problems encountered in practice; its importance should be clear from the developments
of Section VII.6.


The applicability of algebraic function theory to context-free languages has been known 
for some time (e.g., [220]). Our presentation of one-dimensional walks of a general type follows 
articles by Lalley [396] and Banderier and Flajolet [27], which can be regarded as the analytic 
pendant of algebraic studies by Gessel [286, 287]. The kernel method has its origins in prob- 
lems of queueing theory and random walks [202, 203] and is further explored in an article by 
Bousquet-Mélou and Petkovsek [86]. The algebraic treatment of random maps by the quadratic 
method is due to brilliant studies of Tutte in the 1960s: see for instance his census [579] and 
the account in the book by Jackson and Goulden [303]. A combinatorial—analytic treatment of 
multiconnectivity issues is given in [28], where the possibility of treating in a unified manner 
about a dozen families of maps appears clearly. 


Regarding differential equations, an early (and at the time surprising) occurrence in an 
asymptotic expansion of terms of the form $n^{\alpha}$, with $\alpha$ an algebraic number, is found in the
study [252], dedicated to multidimensional search trees. The asymptotic analysis of coeffi- 
cients of solutions to linear differential equations can also, in principle, be approached from the 
recurrences that these coefficients satisfy. Wimp and Zeilberger [611] propose an interesting 
approach based on results by George Birkhoff and his school (e.g., [70]), which are relative to 
difference equations in the complex plane. There are, however, some doubts among special- 
ists regarding the completeness of Birkhoff’s programme (see our discussion in Section VIII. 7, 
p. 581). By contrast, the (easier) singularity theory of linear ODEs is well established, and, as 
we showed in this chapter, it is possible—in the regular singular case at least—to base a sound 
method for asymptotic coefficient extraction on it. 


VIII 


Saddle-point Asymptotics 


Like a lazy hiker, the path crosses the ridge at a low point; 
but unlike the hiker, the best path takes the steepest ascent to the ridge. 
[...] The integral will then be concentrated in a small interval.


— DANIEL GREENE AND DONALD KNUTH [310, sec. 4.3.3] 
A saddle-point of a surface is a point reminiscent of the inner part of a saddle or of a
geographical pass between two mountains. If the surface represents the modulus of an 
analytic function, saddle-points are simply determined as the zeros of the derivative 
of the function. 

In order to estimate complex integrals of an analytic function, it is often a good 
strategy to adopt as contour of integration a curve that “crosses” one or several of 
the saddle-points of the integrand. When applied to integrals depending on a large 
parameter, this strategy provides in many cases accurate asymptotic information. In 
this book, we are primarily concerned with Cauchy integrals expressing coefficients of 
large index of generating functions. The implementation of the method is then fairly 
simple, since integration can be performed along a circle centred at the origin. 

Precisely, the principle of the saddle-point method for the estimation of contour 
integrals is to choose a path crossing a saddle-point, then estimate the integrand lo- 
cally near this saddle-point (where the modulus of the integrand achieves its maximum 
on the contour), and deduce, by local approximations and termwise integration, an 
asymptotic expansion of the integral itself. Some sort of “localization” or “concentra- 
tion” property is required to ensure that the contribution near the saddle-point captures 
the essential part of the integral. A simplified form of the method provides what are 
known as saddle-point bounds—these useful and technically simple upper bounds are 
obtained by applying trivial bounds to an integral relative to a saddle-point path. In 
many cases, the saddle-point method can furthermore provide complete asymptotic
expansions. 

In the context of analytic combinatorics, the method is applicable to Cauchy co- 
efficient integrals, in the case of rapidly varying functions: typical instances are entire 
functions as well as functions with singularities at a finite distance that exhibit some 
form of exponential growth. Saddle-point analysis then complements singularity ana- 
lysis whose scope is essentially the category of functions having only moderate (i.e., 
polynomial) growth at their singularities. The saddle-point method is also a method 
of choice for the analysis of coefficients of large powers of some fixed function and, 
in this context, it paves the way to the study of multivariate asymptotics and limiting 
Gaussian distributions developed in the next chapter. 

Applications are given here to Stirling’s formula, as well as the asymptotics of the 
central binomial coefficients, the involution numbers and the Bell numbers associated 
to set partitions. The asymptotic enumeration of integer partitions is one of the jewels 
of classical analysis and we provide an introduction to this rich topic where saddle- 
points lead to effective estimates of an amazingly good quality. Other combinatorial 
applications include balls-in-bins models and capacity, the number of increasing sub- 
sequences in permutations, and blocks in set partitions. The counting of acyclic graphs 
(equivalently forests of unrooted trees), finally takes us beyond the basic paradigm of 
simple saddle-points by making use of multiple saddle-points, also known as “monkey
saddles”.


Plan of this chapter. First, we examine the surface determined by the modulus 
of an analytic function and give, in Section VIII. 1, a classification of points into three 
kinds: ordinary points, zeros, and saddle-points. Next we develop general purpose 
saddle-point bounds in Section VIII.2, which also serves to discuss the properties 
of saddle-point crossing paths. The saddle-point method per se is presented in Sec- 
tion VIII.3, both in its most general form and in the way it specializes to Cauchy 
coefficient integrals. Section VIII.4 then discusses three examples, involutions, set 
partitions, and fragmented permutations, which help us get further familiarized with 
the method. We next jump to a new level of generality and introduce in Section VIII. 5 
the abstract concept of admissibility—this approach has the merit of providing easily 
testable conditions, while opening the possibility of determining broad classes of func- 
tions to which the saddle-point method is applicable. In particular, many combinato- 
rial types whose leading construction is a SET operation are seen to be “automatically” 
amenable to saddle-point analysis. The case of integer partitions, which is technically 
more advanced, is treated in a separate section, Section VIII.6. The saddle-point method
is also instrumental in analysing coefficients of many generating functions implicitly
defined by differential equations, including holonomic functions: see Section VIII.7.
Next, the framework of “large powers”, developed in Section VIII.8, constitutes a
combinatorial counterpart of the central limit theorem of probability theory, and as 
such it provides a bridge to the study of limit distributions to be treated systematically 
in Chapter IX. Other applications to discrete probability distributions are examined 
in Section VIII.9. Finally, Section VIII.10 serves as a brief introduction to the rich
subject of multiple saddle-points and coalescence. 
VIII. 1. Landscapes of analytic functions and saddle-points


This section introduces a well-known classification of points on the surface rep- 
resenting the modulus of an analytic function. In particular, as we are going to see, 
saddle-points, which are determined by roots of the function’s derivative, are associ- 
ated with a simple geometric property that gives them their name. 

Consider any function $f(z)$ analytic for $z \in \Omega$, where $\Omega$ is some domain of $\mathbb{C}$. Its
modulus $|f(x+iy)|$ can be regarded as a function of the two real quantities, $x = \Re(z)$
and $y = \Im(z)$. As such, it can be represented as a surface in three-dimensional space.
This surface is smooth (analytic functions are infinitely differentiable), but far from
being arbitrary.

Let $z_0$ be an interior point of $\Omega$. The local shape of the surface $|f(z)|$ for $z$ near $z_0$
depends on which of the initial elements in the sequence $f(z_0), f'(z_0), f''(z_0), \ldots$
vanish. As we are going to see, its points can be of only one of three types: ordinary
points (the generic case), zeros, and saddle-points; see Figure VIII.1. The classification
of points is conveniently obtained by considering polar coordinates, writing
$z = z_0 + re^{i\theta}$ with $r$ small.


An ordinary point is such that $f(z_0) \ne 0$, $f'(z_0) \ne 0$. This is the generic situation
as analytic functions have only isolated zeros. In that case, one has, for small $r > 0$,

(1)   $|f(z)| = \left|f(z_0) + re^{i\theta} f'(z_0) + O(r^2)\right| = |f(z_0)|\,\left|1 + \lambda r e^{i(\theta+\phi)} + O(r^2)\right|,$

where we have set $f'(z_0)/f(z_0) = \lambda e^{i\phi}$, with $\lambda > 0$. The modulus then satisfies

      $|f(z)| = |f(z_0)|\,\bigl(1 + \lambda r \cos(\theta+\phi) + O(r^2)\bigr).$

Thus, for $r$ kept small enough and fixed, as $\theta$ varies, $|f(z)|$ is maximum when
$\theta = -\phi$ (where it is $\sim |f(z_0)|(1 + \lambda r)$), and minimum when $\theta = -\phi + \pi$ (where it is
$\sim |f(z_0)|(1 - \lambda r)$). When $\theta = -\phi \pm \frac{\pi}{2}$, one has $|f(z)| = |f(z_0)| + o(r)$, which
means that $|f(z)|$ is essentially constant. This is easily interpreted: the line $\theta \equiv -\phi$
(mod $\pi$) is (locally) a steepest descent line, the perpendicular line $\theta \equiv -\phi + \frac{\pi}{2}$
(mod $\pi$) is locally a level line. In particular, near an ordinary point, the surface $|f(z)|$
has neither a minimum nor a maximum. In figurative terms, this is like standing on
the flank of a mountain.


A zero is by definition a point such that $f(z_0) = 0$. In this case, the function
$|f(z)|$ attains its minimum value 0 at $z_0$. Locally, to first order, one has $|f(z)| \sim
|f'(z_0)|\,r$ for a simple zero and $|f(z)| = O(r^m)$ for a zero of order $m$. A zero is thus
like a sink or the bottom of a lake, save that, in the landscape of an analytic function,
all lakes are at sea level.


A saddle-point is a point such that $f(z_0) \ne 0$, $f'(z_0) = 0$; it thus corresponds
to a zero of the derivative, when the function itself does not vanish. It is said to be
a simple saddle-point if furthermore $f''(z_0) \ne 0$. In that case, a calculation similar
to (1),

(2)   $|f(z)| = \left|f(z_0) + \tfrac12 r^2 e^{2i\theta} f''(z_0) + O(r^3)\right| = |f(z_0)|\,\left|1 + \lambda r^2 e^{i(2\theta+\phi)} + O(r^3)\right|,$
Figure VIII.1. The different types of points on a surface $|f(z)|$: an ordinary point
($f(z_0) \ne 0$, $f'(z_0) \ne 0$), a zero ($f(z_0) = 0$), a simple saddle-point ($f(z_0) \ne 0$,
$f'(z_0) = 0$, $f''(z_0) \ne 0$). Top: a diagram showing the local structure of level
curves (in solid lines), steepest descent lines (dashed with arrows pointing towards the
direction of increase) and regions (hashed) where the surface lies below the reference
value $|f(z_0)|$. Bottom: the function $f(z) = \cosh z$ and the local shape of $|f(z)|$ near
an ordinary point ($i\pi/4$), a zero ($i\pi/2$), and a saddle-point (0), with level lines shown
on the surfaces.


where we have set $\frac12 f''(z_0)/f(z_0) = \lambda e^{i\phi}$, shows that the modulus satisfies

      $|f(z)| = |f(z_0)|\,\bigl(1 + \lambda r^2 \cos(2\theta + \phi) + O(r^3)\bigr).$

Thus, starting at the direction $\theta = -\phi/2$ and turning around $z_0$, the following sequence
of events regarding the modulus $|f(z)| = |f(z_0 + re^{i\theta})|$ is observed: it is maximal
($\theta = -\phi/2$), stationary ($\theta = -\phi/2 + \pi/4$), minimal ($\theta = -\phi/2 + \pi/2$), stationary
($\theta = -\phi/2 + 3\pi/4$), maximal again ($\theta = -\phi/2 + \pi$), and so on. The pattern,
symbolically “+ = − =”, repeats itself twice. This is superficially similar to an ordinary
point, save for the important fact that changes are observed at twice the angular speed.
Accordingly, the shape of the surface looks quite different; it is like the central part of
a saddle. Two level curves cross at a right angle: one steepest descent line (away from
the saddle-point) is perpendicular to another steepest descent line (towards the saddle-point).
In a mountain landscape, this is thus much like a pass between two mountains.
The two regions on each side corresponding to points with an altitude below a simple
saddle-point are often referred to as “valleys”.
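
The angular patterns just described can be observed numerically. The sketch below (an illustration only; the helper names, the radius 0.01 and the number of samples are arbitrary choices made here) samples $|f|$ on a small circle around a point and counts the local maxima met in one full turn, taking $f(z) = \cosh z$ with the ordinary point $i\pi/4$ and the saddle-point 0.

```python
import cmath
from math import pi

def modulus_on_circle(f, z0, r=0.01, samples=720):
    """Values of |f| at equally spaced points of the circle |z - z0| = r."""
    return [abs(f(z0 + r * cmath.exp(2j * pi * k / samples)))
            for k in range(samples)]

def count_local_maxima(values):
    """Count strict local maxima of a cyclic list of samples."""
    m = len(values)
    return sum(1 for k in range(m)
               if values[k - 1] < values[k] > values[(k + 1) % m])

# cosh z near an ordinary point (i*pi/4) and near a simple saddle-point (0).
ordinary = count_local_maxima(modulus_on_circle(cmath.cosh, 1j * pi / 4))
saddle = count_local_maxima(modulus_on_circle(cmath.cosh, 0))
print(ordinary, saddle)
```

According to the expansions above, one turn around an ordinary point meets a single maximum, whereas one turn around a simple saddle-point meets two, since the pattern “+ = − =” is traversed twice.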




Figure VIII.2. The “tripod”: two views of $|1 + z + z^2 + z^3|$ as function of $x = \Re(z)$,
$y = \Im(z)$: (left) the modulus as a surface in $\mathbb{R}^3$; (right) the projection of level lines
on the $z$-plane.


Generally, a multiple saddle-point has multiplicity $p$ if $f(z_0) \ne 0$ and all derivatives
$f'(z_0), \ldots, f^{(p)}(z_0)$ are equal to zero while $f^{(p+1)}(z_0) \ne 0$. In that case, the
basic pattern “+ = − =” repeats itself $p + 1$ times. For instance, from a double
saddle-point ($p = 2$), three roads go down to three different valleys separated by
the flanks of three mountains. A double saddle-point is also called a “monkey saddle”
since it can be visualized as a saddle having places for the legs and the tail: see
Figure VIII.12 (p. 602) and Figure VIII.14 (p. 605).


Theorem VIII.1 (Classification of points on modulus surfaces). A surface $|f(z)|$ attached
to the modulus of a function analytic over an open set $\Omega$ has points of only
three possible types: (i) ordinary points, (ii) zeros, (iii) saddle-points. Under projection
on the complex plane, a simple saddle-point is locally the common apex of two
curvilinear sectors with angle $\pi/2$, referred to as “valleys”, where the modulus of the
function is smaller than at the saddle-point.


As a consequence, the surface defined by the modulus of an analytic function has 
no maximum: this property is known as the Maximum Modulus Principle. It has no 
minimum either, apart from zeros. It is therefore a peakless landscape, in de Bruijn’s 
words [143]. Accordingly, for a meromorphic function, peaks are at $\infty$ and minima
are at 0, the other points being either ordinary points or isolated saddle-points.


Example VIII.1. The tripod: a cubic polynomial. An idea of the typical shape of the surface
representing the modulus of an analytic function can be obtained by examining Figure VIII.2
relative to the third degree polynomial $f(z) = 1 + z + z^2 + z^3$. Since $f(z) = (1 - z^4)/(1 - z)$,
the zeros are at

      $-1, \qquad i, \qquad -i.$

There are saddle-points at the zeros of the derivative $f'(z) = 1 + 2z + 3z^2$, that is, at the points

      $\zeta := -\frac13 + \frac{i}{3}\sqrt{2}, \qquad \zeta' := -\frac13 - \frac{i}{3}\sqrt{2}.$

The diagram below summarizes the position of these “interesting” points:

(3)
                                $i$ (zero)
         $\zeta = -\frac13 + \frac{i}{3}\sqrt{2}$ (saddle-point)
      $-1$ (zero)                 $0$
         $\zeta' = -\frac13 - \frac{i}{3}\sqrt{2}$ (saddle-point)
                                $-i$ (zero)

The three zeros are especially noticeable on Figure VIII.2 (left), where they appear at the end
of the three “legs”. The two saddle-points are visible on Figure VIII.2 (right) as intersection
points of level curves.
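
The positions of the zeros and saddle-points of the tripod can be confirmed by a direct computation. The following is a minimal sketch (function and variable names are chosen here for illustration): the saddle-points are obtained from the quadratic formula applied to $f'(z) = 3z^2 + 2z + 1$, and each point is then classified by evaluating $f$ and $f'$.

```python
import cmath

def f(z):
    """The tripod polynomial 1 + z + z^2 + z^3."""
    return 1 + z + z**2 + z**3

def df(z):
    """Its derivative 1 + 2z + 3z^2."""
    return 1 + 2*z + 3*z**2

# Zeros of f: the fourth roots of unity other than 1.
zeros = [-1, 1j, -1j]

# Roots of 3z^2 + 2z + 1 = 0 by the quadratic formula.
disc = cmath.sqrt(2**2 - 4 * 3 * 1)
saddles = [(-2 + disc) / 6, (-2 - disc) / 6]

for z in zeros:
    print("zero        ", z, abs(f(z)))
for z in saddles:
    print("saddle-point", z, abs(df(z)), abs(f(z)))
```

At each saddle-point the derivative vanishes (up to rounding) while the function does not, which is exactly the defining condition used in this section.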


▷ VIII.1. The Fundamental Theorem of Algebra. This theorem asserts that a non-constant
polynomial has at least one root, hence $n$ roots if its degree is $n$ (Note IV.38, p. 270). Let
$P(z) = 1 + a_1 z + \cdots + a_n z^n$ be a polynomial of degree $n$. Consider $f(z) = 1/P(z)$. By basic
analysis, one can take $R$ sufficiently large, so that on $|z| = R$, one has $|f(z)| < \frac12$. Assume
a contrario that $P(z)$ has no zero. Then, $f(z)$ which is analytic in $|z| \le R$ should attain its
maximum at an interior point (since $f(0) = 1$), so that a contradiction has been reached. ◁


▷ VIII.2. Saddle-points of polynomials and the convex hull of zeros. Let $P$ be a polynomial
and $H$ the convex hull of its zeros. Then any root of $P'(z)$ lies in $H$. (Proof: assume distinct
zeros and consider

      $\phi(z) = \frac{P'(z)}{P(z)} = \sum_{\alpha \,:\, P(\alpha) = 0} \frac{1}{z - \alpha}.$

If $z$ lies outside $H$, then $z$ “sees” all zeros $\alpha$ in a half-plane, this by elementary geometry.
By projection on the normal to the half-plane boundary, it is found that, for some $\theta$, one has
$\Re(e^{i\theta}\phi(z)) < 0$, so that $P'(z) \ne 0$.) ◁


VIII. 2. Saddle-point bounds 


Saddle-point analysis is a general method suited to the estimation of integrals of
analytic functions $F(z)$,

(4)   $I = \int_A^B F(z)\,dz,$

where $F(z) \equiv F_n(z)$ involves some large parameter $n$. The method is instrumental
when the integrand $F$ is subject to rather violent variations, typically when there occurs
in it some exponential or some fixed function raised to a large power $n \to +\infty$.
In this section, we discuss some of the global properties of saddle-point contours,
then particularize the discussion to Cauchy coefficient integrals. General saddle-point
bounds, which are easy to derive, result from simple geometric considerations (a preliminary
discussion appears in Chapter IV, p. 246).
Starting from the general form (4), we let $C$ be a contour joining $A$ and $B$ and
taken in a domain of the complex plane where $F(z)$ is analytic. By standard inequalities,
we have

(5)   $|I| \le \|C\| \cdot \sup_{z \in C} |F(z)|,$

with $\|C\|$ representing the length of $C$. This is the common trivial bound from integration
theory applied to a fixed contour $C$.

For an analytic integrand $F$ with $A$ and $B$ inside the domain of analyticity, there
is an infinite class $\mathcal{P}$ of acceptable paths to choose from, all in the analyticity domain
of $F$. Thus, by optimizing the bound (5), we may write

(6)   $|I| \le \inf_{C \in \mathcal{P}} \Bigl[\, \|C\| \cdot \sup_{z \in C} |F(z)| \,\Bigr],$

where the infimum is taken over all paths $C \in \mathcal{P}$. Broadly speaking, a bound of this
type is called a saddle-point bound¹.

The length factor $\|C\|$ usually turns out to be unimportant for asymptotic bounding
purposes; this is, for instance, the case when paths remain in finite regions of the
complex plane. If there happens to be a path $C$ from $A$ to $B$ such that no point is
at an altitude higher than $\sup(|F(A)|, |F(B)|)$, then a simple bound results, namely,
$|I| \le \|C\| \cdot \sup(|F(A)|, |F(B)|)$: this is in a sense the uninteresting case. The common
situation, typical of Cauchy coefficient integrals of combinatorics, is that paths have to
go at some higher altitude than the end points. A path $C$ that traverses a saddle-point
by connecting two points at a lower altitude on the surface $|F(z)|$ and by following
two steepest descent lines across the saddle-point is clearly a local minimum for the
path functional

      $\Phi(C) = \sup_{z \in C} |F(z)|,$

as neighbouring paths must possess a higher maximum. Such a path is called a saddle-point
path or steepest descent path. Then, the search for a path minimizing

      $\inf_{C} \Bigl[\, \sup_{z \in C} |F(z)| \,\Bigr]$

(a simplification of (6) to its essential feature) naturally suggests considering saddle-points
and saddle-point paths. This leads to the variant of (6),

(7)   $|I| \le \|C_0\| \cdot \sup_{z \in C_0} |F(z)|, \qquad C_0 \text{ minimizes } \sup_{z \in C} |F(z)|,$

also referred to as a saddle-point bound.
We can summarize this stage of the discussion by a simple generic statement.
We can summarize this stage of the discussion by a simple generic statement. 


Theorem VIII.2 (General saddle-point bounds). Let $F(z)$ be a function analytic in
a domain $\Omega$. Consider the class of integrals $\int_\gamma F(z)\,dz$ where the contour $\gamma$ connects
two points $A$, $B$ and is constrained to a class $\mathcal{P}$ of allowable paths in $\Omega$ (e.g., those
that encircle 0). Then one has the saddle-point bound²:

(8)   $\left|\int_\gamma F(z)\,dz\right| \le \|C_0\| \cdot \sup_{z \in C_0} |F(z)|,$   where $C_0$ is any path that minimizes $\sup_{z \in C} |F(z)|$.

If $A$ and $B$ lie in opposite valleys of a saddle-point $z_0$, then the minimization problem
is solved by saddle-point paths $C_0$ made of arcs connecting $A$ to $B$ through $z_0$. In that
case, one has

      $\left|\int_A^B F(z)\,dz\right| \le \|C_0\| \cdot |F(z_0)|, \qquad F'(z_0) = 0.$

¹Notice additionally that the optimization problem need not be solved exactly, as any approximate
solution to (6) still furnishes a valid upper bound because of the universal character of the trivial bound (5).

Borrowing a metaphor of de Bruijn [143], the situation may be described as follows.
Estimating a path integral is like estimating the difference of altitude between
two villages in a mountain range. If the two villages are in different valleys, the best
strategy (this is what road networks often do) consists in following paths that cross
boundaries between valleys at passes, i.e., through saddle-points.

The statement of Theorem VIII.2 does not fix all details of the contour when
there are several saddle-points “separating” $A$ and $B$: the problem is like finding the
most economical route across a whole mountain range. But at least it suggests the
construction of a composite contour made of connected arcs crossing saddle-points
from valley to valley. Furthermore, in cases of combinatorial interest, some strong
positivity is present and the selection of the suitable saddle-point contour is normally
greatly simplified, as we explain next.


▷ VIII.3. An integral of powers. Consider the polynomial $P(z) = 1 + z + z^2 + z^3$ of Example VIII.1.
Define the line integral

      $I_n = \int_{-1}^{+i} P(z)^n\,dz.$


On the segment connecting the end points, the maximum of $|P(z)|$ is 0.63831, giving the weak
trivial bound $I_n = O(0.63831^n)$. In contrast, there is a saddle-point at $\zeta = -\frac13 + \frac{i}{3}\sqrt{2}$,
resulting in the bound

      $|I_n| \le \lambda\,|P(\zeta)|^n, \qquad \lambda := |\zeta + 1| + |i - \zeta| \doteq 1.44141,$


as follows from adopting a contour made of two segments connecting $-1$ to $i$ through $\zeta$. Discuss
further the bounds on $\int_{\alpha}^{\alpha'} P(z)^n\,dz$ when $(\alpha, \alpha')$ ranges over all pairs of roots of $P$. ◁


Saddle-point bounds for Cauchy coefficient integrals. Saddle-point bounds can
be applied to Cauchy coefficient integrals,

(9)   $g_n \equiv [z^n]G(z) = \frac{1}{2i\pi} \oint G(z)\,\frac{dz}{z^{n+1}},$

for which we can avail ourselves of the previous discussion, with $F(z) = G(z)z^{-n-1}$.
In (9) the symbol $\oint$ indicates that the allowable paths are constrained to encircle
the origin (the domain of definition of the integrand is a subset of $\mathbb{C} \setminus \{0\}$; the points
$A$, $B$ can then be seen as coinciding and taken somewhere along the negative real line;
equivalently, one may take $A = -ae^{i\epsilon}$ and $B = -ae^{-i\epsilon}$, for $a > 0$ and $\epsilon > 0$).

²The form given by (8) is in principle weaker than the form (6), since it does not take into account the
length of the contour itself, but the difference is immaterial in all our asymptotic problems.

In the particular case where $G(z)$ is a function with non-negative coefficients, a
simple condition guarantees the existence of a saddle-point on the positive real axis.
Indeed, assume that $G(z)$, which has radius of convergence $R$ with $0 < R \le +\infty$,
satisfies $G(z) \to +\infty$ as $z \to R^-$ along the real axis and $G(z)$ not a polynomial. Then
the integrand $F(z) = G(z)z^{-n-1}$ satisfies $F(0^+) = F(R^-) = +\infty$. This means that
there exists at least one local minimum of $F$ over $(0, R)$, hence, at least one value
$\zeta \in (0, R)$ where the derivative $F'$ vanishes. (Actually, there can be only one such
point; see Note VIII.4, p. 550.) Since $\zeta$ corresponds to a local minimum of $F$, we have
additionally $F''(\zeta) > 0$, so that the saddle-point is crossed transversally by a circle
of radius $\zeta$. Thus, the saddle-point bound, specialized to circles centred at the origin,
yields the following corollary.


Corollary VIII.1 (Saddle-point bounds for generating functions). Let $G(z)$, not a
polynomial, be analytic at 0 with non-negative coefficients and radius of convergence
$R \le +\infty$. Assume that $G(R^-) = +\infty$. Then one has

(10)   $[z^n]G(z) \le \frac{G(\zeta)}{\zeta^n}$,   with $\zeta \in (0, R)$ the unique root of   $\zeta\,\frac{G'(\zeta)}{G(\zeta)} = n + 1.$

Proof. The saddle-point is the point where the derivative of the integrand is 0. Therefore,
we consider $(G(z)z^{-n-1})' = 0$, or $G'(z)z^{-n-1} - (n+1)G(z)z^{-n-2} = 0$, or

      $z\,\frac{G'(z)}{G(z)} = n + 1.$

We refer to this as the saddle-point equation and use $\zeta$ to denote its positive root. The
perimeter of the circle is $2\pi\zeta$, so that the inequality $[z^n]G(z) \le G(\zeta)/\zeta^n$ follows. ∎
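
The corollary translates directly into a small computation. The sketch below is illustrative only: the function names, the bisection solver and the search interval are choices made here, and the solver relies on the sign change of $x\,G'(x)/G(x) - (n+1)$ across the unique positive root. It is tried on $G(z) = e^z$, for which the saddle-point equation reads $\zeta = n + 1$.

```python
from math import exp, factorial

def solve_saddle_point(g, dg, n, lo, hi, steps=200):
    """Bisection for the root of x * G'(x) / G(x) = n + 1 on (lo, hi).

    g and dg evaluate G and G' at a positive real argument.
    """
    def h(x):
        return x * dg(x) / g(x) - (n + 1)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if h(mid) < 0:
            lo = mid      # the root lies to the right of mid
        else:
            hi = mid      # the root lies to the left of mid (or at mid)
    return (lo + hi) / 2

def saddle_point_bound(g, dg, n, lo, hi):
    """Return (zeta, G(zeta) / zeta^n), the bound of Equation (10)."""
    zeta = solve_saddle_point(g, dg, n, lo, hi)
    return zeta, g(zeta) / zeta ** n

n = 10
zeta, bound = saddle_point_bound(exp, exp, n, 1e-9, 100.0)
print(zeta, bound, 1 / factorial(n))
```

For this choice of $G$ the bound is $e^{n+1}/(n+1)^n$, to be compared with the exact coefficient $1/n!$.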


Corollary VIII.1 is equivalent to Proposition IV.1, p. 246, on which it sheds a new 
light, while paving the way to the full saddle-point method to be developed in the next 
section. 


We examine below two particular cases related to the central binomial and the
inverse factorial. The corresponding landscapes of Figure VIII.3, which bear a surprising
resemblance to one another, are, by the previous discussion, instances of a
general pattern for functions with non-negative coefficients. It is seen on these two
examples that the saddle-point bounds already catch the proper exponential growths,
being off only by a factor of $O(n^{-1/2})$.

Example VIII.2. Saddle-point bounds for central binomials and inverse factorials. Consider
the two contour integrals around the origin

(11)   $J_n = \frac{1}{2i\pi} \oint (1+z)^{2n}\,\frac{dz}{z^{n+1}}, \qquad K_n = \frac{1}{2i\pi} \oint e^{z}\,\frac{dz}{z^{n+1}},$

whose values are otherwise known, by virtue of Cauchy’s coefficient formula, to be $J_n = \binom{2n}{n}$
and $K_n = 1/n!$. In that case, one can think of the end points $A$ and $B$ as coinciding and taken
somewhat arbitrarily on the negative real axis, while the contour has to encircle the origin once
and counter-clockwise.

Figure VIII.3. The modulus of the integrands of $J_n$ (central binomials) and $K_n$ (inverse
factorials) for $n = 5$ and the corresponding saddle-point contours.

The saddle-point equations are, respectively,

      $\frac{2n}{1+z} - \frac{n+1}{z} = 0, \qquad 1 - \frac{n+1}{z} = 0,$

the corresponding saddle-points being $\zeta = \frac{n+1}{n-1}$ and $\zeta' = n + 1$. This provides the upper
bounds

(12)   $J_n = \binom{2n}{n} \le \left(\frac{4n^2}{n^2-1}\right)^{n} \le \frac{16}{9}\,4^n, \qquad K_n = \frac{1}{n!} \le \frac{e^{n+1}}{(n+1)^n},$

which are valid for all values $n \ge 2$.
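
The bounds (12) can be checked against the exact values for moderate $n$. The following sketch (helper names are illustrative; exact rational arithmetic is used for the binomial bound so that the comparison with $\frac{16}{9}4^n$ is not affected by rounding) also prints the ratio of each bound to the exact value, which the text predicts to grow like a constant times $\sqrt{n}$.

```python
from fractions import Fraction
from math import comb, exp, factorial

def central_binomial_bound(n: int) -> Fraction:
    """Exact value of (4n^2 / (n^2 - 1))^n, the first bound in (12)."""
    return Fraction(4 * n * n, n * n - 1) ** n

def inverse_factorial_bound(n: int) -> float:
    """Value of e^(n+1) / (n+1)^n, the second bound in (12)."""
    return exp(n + 1) / (n + 1) ** n

for n in (2, 5, 10, 50):
    j_ratio = float(central_binomial_bound(n) / comb(2 * n, n))
    k_ratio = inverse_factorial_bound(n) * factorial(n)
    # Both ratios are at least 1 and increase slowly with n.
    print(n, j_ratio, k_ratio)
```

Ratios that stay above 1 confirm the inequalities; their slow growth reflects the missing factor of order $n^{-1/2}$ mentioned before the example.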


▷ VIII.4. Upward convexity of $G(x)x^{-n}$. For $G(z)$ having non-negative coefficients at the
origin, the quantity $G(x)x^{-n}$ is upward convex for $x > 0$, so that the saddle-point equation for
$\zeta$ can have at most one root. Indeed, the second derivative

      $\frac{d^2}{dx^2}\,\frac{G(x)}{x^n} = \frac{x^2 G''(x) - 2nx\,G'(x) + n(n+1)\,G(x)}{x^{n+2}},$

is positive for $x > 0$ since its numerator,

      $\sum_{k \ge 0} (n+1-k)(n-k)\,g_k x^k, \qquad g_k := [z^k]G(z),$

has only non-negative coefficients. (See Note IV.46, p. 280, for an alternative derivation.) ◁


▷ VIII.5. A minor optimization. The bounds of Equation (6), p. 547, which take the length of
the contour into account, lead to estimates that closely resemble (10). Indeed, we have

(13)   $[z^n]G(z) \le \frac{G(\widehat{\zeta})}{\widehat{\zeta}^{\,n}}, \qquad \widehat{\zeta} \text{ root of } z\,\frac{G'(z)}{G(z)} = n,$

when optimization is carried out over circles centred at the origin. ◁
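
To see how close the two variants are, one may compare them on $G(z) = e^z$, where both saddle-point equations are solved explicitly: the root is $n + 1$ for (10) and $n$ for (13). The sketch below is only an illustration under that choice of $G$; the function names are hypothetical.

```python
from math import exp, factorial

def bound_10(n: int) -> float:
    """Bound (10) for G(z) = e^z: the root of z = n + 1 gives e^(n+1)/(n+1)^n."""
    return exp(n + 1) / (n + 1) ** n

def bound_13(n: int) -> float:
    """Bound (13) for G(z) = e^z: the root of z = n gives e^n / n^n."""
    return exp(n) / n ** n

for n in (1, 5, 20, 100):
    exact = 1 / factorial(n)
    # Columns: n, exact coefficient, optimized bound (13), bound (10).
    print(n, exact, bound_13(n), bound_10(n))
```

In this instance the ratio of the two bounds is $e/(1 + 1/n)^n$, which tends to 1, so the optimization is indeed minor.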


VIII. 3. Overview of the saddle-point method 


Given a complex integral with a contour traversing a single saddle-point, the 
saddle-point corresponds locally to a maximum of the integrand along the path. It 
is then natural to expect that a small neighbourhood of the saddle-point may provide 
the dominant contribution to the integral. The saddle-point method is applicable pre- 
cisely when this is the case and when this dominant contribution can be estimated by 
means of local expansions. The method then constitutes the complex analytic coun- 
terpart of the method of Laplace (Appendix B.6: Laplace’s method, p. 755) for the 
evaluation of real integrals depending on a large parameter, and we can regard it as 
being 

Saddle-point method = Choice of contour + Laplace’s method. 


Similar to its real-variable counterpart, the saddle-point method is a general strategy 
rather than a completely deterministic algorithm, since many choices are left open in 
the implementation of the method concerning details of the contour and choices of its 
splitting into pieces. 


To proceed, it is convenient to set $F(z) = e^{f(z)}$ and consider

\[ I = \int_A^B e^{f(z)} \, dz, \tag{14} \]

where $f(z) \equiv f_n(z)$, as $F(z) \equiv F_n(z)$ in the previous section, involves some large
parameter n. Following possibly some preparation based on Cauchy's theorem, we
may assume that the contour C connects two end points A and B lying in opposite
valleys of the saddle-point $\zeta$. The saddle-point equation is $F'(\zeta) = 0$, or equivalently,
since $F = e^{f}$:

\[ f'(\zeta) = 0. \]

The saddle-point method, of which a summary is given in Figure VIII.4, is based
on a fundamental splitting of the integration contour. We decompose $C = C^{(0)} \cup C^{(1)}$,
where $C^{(0)}$, called the "central part", contains $\zeta$ (or passes very near to it) and $C^{(1)}$
is formed of the two remaining "tails". This splitting has to be determined in each
case in accordance with the growth of the integrand. The basic principle rests on two
major conditions: the contributions of the two tails should be asymptotically negligible
(condition SP1); in the central region, the quantity f(z) in the integrand should be
asymptotically well approximated by a quadratic function (condition SP2). Under
these conditions, the integral is asymptotically equivalent to an incomplete Gaussian
integral. It then suffices to verify (this is condition SP3, usually a minor a posteriori
technical verification) that tails can be completed back, introducing only negligible
error terms. By this sequence of steps, the original integral is asymptotically reduced
to a complete Gaussian integral, which evaluates in closed form.


Specifically, the three steps of the saddle-point method involve checking conditions
expressed by Equations (15), (16), and (18) below.

Goal: Estimate $\int_A^B F(z)\,dz$, setting $F = e^{f}$; here, $F \equiv F_n$ and $f \equiv f_n$ depend on a large
parameter n.

- The end points A, B are assumed to lie in opposite valleys of the saddle-point.
- A contour C through (or near) a simple saddle-point $\zeta$, so that $f'(\zeta) = 0$, has been chosen.
- The contour is split as $C = C^{(0)} \cup C^{(1)}$.

The following conditions are to be verified.

SP1: Tails pruning. On the contour $C^{(1)}$, the tails integral $\int_{C^{(1)}}$ is negligible:

\[ \int_{C^{(1)}} F(z)\,dz = o\!\left( \int_{C} F(z)\,dz \right). \]

SP2: Central approximation. Along $C^{(0)}$, a quadratic expansion,

\[ f(z) = f(\zeta) + \frac12 f''(\zeta)(z-\zeta)^2 + O(\eta_n), \]

is valid, with $\eta_n \to 0$ as $n \to \infty$, uniformly with respect to $z \in C^{(0)}$.

SP3: Tails completion. The incomplete Gaussian integral resulting from SP2, taken over the
central range, is asymptotically equivalent to a complete Gaussian integral (with $f''(\zeta) =
e^{i\phi}|f''(\zeta)|$ and $\varepsilon = \pm 1$ depending on orientation):

\[ \int_{C^{(0)}} e^{\frac12 f''(\zeta)(z-\zeta)^2}\,dz \sim \varepsilon i e^{-i\phi/2} \int_{-\infty}^{\infty} e^{-|f''(\zeta)|x^2/2}\,dx \equiv \varepsilon i e^{-i\phi/2} \sqrt{\frac{2\pi}{|f''(\zeta)|}}. \]

Result: Assuming SP1, SP2, and SP3, one has, with $\varepsilon = \pm 1$ and $\arg(f''(\zeta)) = \phi$:

\[ \frac{1}{2i\pi} \int_A^B e^{f(z)}\,dz \sim \varepsilon e^{-i\phi/2}\, \frac{e^{f(\zeta)}}{\sqrt{2\pi |f''(\zeta)|}} = \pm \frac{e^{f(\zeta)}}{\sqrt{2\pi f''(\zeta)}}. \]

Figure VIII.4. A summary of the basic saddle-point method.


SP1: Tails pruning. On the contour $C^{(1)}$, the tail integral $\int_{C^{(1)}}$ is negligible:

\[ \int_{C^{(1)}} F(z)\,dz = o\!\left( \int_{C} F(z)\,dz \right). \tag{15} \]

This condition is usually established by proving that F(z) remains small enough (e.g.,
exponentially small in the scale of the problem) away from $\zeta$, for $z \in C^{(1)}$.

SP2: Central approximation. Along $C^{(0)}$, a quadratic expansion,

\[ f(z) = f(\zeta) + \frac12 f''(\zeta)(z-\zeta)^2 + O(\eta_n), \tag{16} \]

is valid, with $\eta_n \to 0$ as $n \to \infty$, uniformly for $z \in C^{(0)}$. This guarantees that $\int_{C^{(0)}} e^{f}$ is
well-approximated by an incomplete Gaussian integral:

\[ \int_{C^{(0)}} e^{f(z)}\,dz \sim e^{f(\zeta)} \int_{C^{(0)}} e^{\frac12 f''(\zeta)(z-\zeta)^2}\,dz. \tag{17} \]

SP3: Tails completion. The tails can be completed back, at the expense of asymptotically
negligible terms, meaning that the incomplete Gaussian integral is asymptotically
equivalent to a complete one (itself given by (12), p. 744),

\[ \int_{C^{(0)}} e^{\frac12 f''(\zeta)(z-\zeta)^2}\,dz \sim \varepsilon i e^{-i\phi/2} \int_{-\infty}^{\infty} e^{-|f''(\zeta)|x^2/2}\,dx \equiv \varepsilon i e^{-i\phi/2} \sqrt{\frac{2\pi}{|f''(\zeta)|}}, \tag{18} \]

where $\varepsilon = \pm 1$ is determined by the orientation of the original contour C, and $f''(\zeta) =
e^{i\phi}|f''(\zeta)|$. This last step deserves a word of explanation. Along a steepest descent
curve across $\zeta$, the quantity $f''(\zeta)(z-\zeta)^2$ is real and negative, as we saw when discussing
saddle-point landscapes (p. 543). Indeed, with $f''(\zeta) = e^{i\phi}|f''(\zeta)|$, one has
$\arg(z-\zeta) \equiv -\phi/2 + \pi/2 \pmod{\pi}$. Thus, the change of variables $x = \pm i(z-\zeta)e^{i\phi/2}$
reduces the left side of (18) to an integral taken along (or close to) the real line³. The
condition (18) then demands that this integral can be completed to a complete Gaussian
integral, which itself evaluates in closed form.

If these conditions are granted, one has the chain

\[ \int_{C} e^{f}\,dz \sim \int_{C^{(0)}} e^{f}\,dz \sim e^{f(\zeta)} \int_{C^{(0)}} e^{\frac12 f''(\zeta)(z-\zeta)^2}\,dz \sim \pm i e^{-i\phi/2} e^{f(\zeta)} \sqrt{\frac{2\pi}{|f''(\zeta)|}}, \]

by virtue of Equations (15), (17), (18). In summary:


Theorem VIII.3 (Saddle-point Algorithm). Consider an integral $\int_A^B F(z)\,dz$, where
the integrand $F = e^{f}$ is an analytic function depending on a large parameter and
A, B lie in opposite valleys across a saddle-point $\zeta$, which is a root of the saddle-point
equation

\[ f'(\zeta) = 0 \]

(or, equivalently, $F'(\zeta) = 0$). Assume that the contour C connecting A to B can be
split into $C = C^{(0)} \cup C^{(1)}$ in such a way that the following conditions are satisfied:
(i) tails are negligible, in the sense of Equation (15) of SP1,
(ii) a central approximation holds, in the sense of Equation (16) of SP2,
(iii) tails can be completed back, in the sense of Equation (18) of SP3.

Then one has, with $\varepsilon = \pm 1$ reflecting orientation and $\phi = \arg(f''(\zeta))$:

\[ \frac{1}{2i\pi} \int_A^B e^{f(z)}\,dz \sim \varepsilon e^{-i\phi/2}\, \frac{e^{f(\zeta)}}{\sqrt{2\pi |f''(\zeta)|}} = \pm \frac{e^{f(\zeta)}}{\sqrt{2\pi f''(\zeta)}}. \tag{19} \]

It can be verified at once that a blind application of the formula to the two integrals
of Example VIII.2 produces the expected asymptotic estimates

\[ J_n \equiv \binom{2n}{n} \sim \frac{4^n}{\sqrt{\pi n}} \qquad \text{and} \qquad K_n \equiv \frac{1}{n!} \sim \frac{1}{n^n e^{-n} \sqrt{2\pi n}}. \tag{20} \]

The complete justification in the case of $K_n$ is given in Example VIII.3 below. The
case of $J_n$ is covered by the general theory of "large powers" of Section VIII. 8, p. 585.
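The "blind application" of (19) to the two integrals of Example VIII.2 can be reproduced numerically. The sketch below is illustrative only: it takes the sign in (19) to be +, uses $f(z) = \log F(z)$ with the saddle-points $\zeta' = n+1$ and $\zeta = (n+1)/(n-1)$ found earlier, and works with logarithms (via `lgamma`) to avoid overflow. The function names are assumptions of this sketch.

```python
from math import exp, lgamma, log, pi


def log_saddle_estimate(f_at_zeta, f2_at_zeta):
    """log of e^{f(zeta)} / sqrt(2 pi f''(zeta)), assuming f''(zeta) > 0."""
    return f_at_zeta - 0.5 * log(2 * pi * f2_at_zeta)


def ratio_K(n):
    """(saddle-point estimate of K_n) / (1/n!), with f(z) = z - (n+1) log z."""
    zeta = n + 1
    f = zeta - (n + 1) * log(zeta)
    f2 = (n + 1) / zeta ** 2
    return exp(log_saddle_estimate(f, f2) + lgamma(n + 1))


def ratio_J(n):
    """(saddle-point estimate of J_n) / binomial(2n, n),
    with f(z) = 2n log(1+z) - (n+1) log z."""
    zeta = (n + 1) / (n - 1)
    f = 2 * n * log(1 + zeta) - (n + 1) * log(zeta)
    f2 = -2 * n / (1 + zeta) ** 2 + (n + 1) / zeta ** 2
    log_binomial = lgamma(2 * n + 1) - 2 * lgamma(n + 1)
    return exp(log_saddle_estimate(f, f2) - log_binomial)


for n in (10, 100, 1000):
    print(n, ratio_K(n), ratio_J(n))
```

Both ratios should approach 1 as n grows, in agreement with (20); the computation says nothing about the validity of conditions SP1 to SP3, which still have to be verified separately.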


³The sign in (18) is naturally well-defined, once the data A, B, and f are fixed: one possibility is to
adopt the determination of $\phi/2 \pmod{\pi}$ such that A and B are sent close to the negative and the positive
real axis, respectively, after the final change of variables $x = i(z-\zeta)e^{i\phi/2}$.

In order for the saddle-point method to work, conflicting requirements regarding
the dimensioning of $C^{(0)}$ and $C^{(1)}$ must be satisfied. The tails pruning and tails
completion conditions, SP1 and SP3, force $C^{(0)}$ to be chosen large enough, so as to
capture the main contribution to the integral; the central approximation condition SP2
requires $C^{(0)}$ to be small enough, to the effect that f(z) can be suitably reduced to its
quadratic expansion. Usually, one has to take $\|C^{(0)}\| / \|C\| \to 0$, and the following
observation may help make the right choices. The error in the two-term expansion being
likely given by the next term, which involves a third derivative, it is a good guess to
dimension $C^{(0)}$ to be of length $\delta \equiv \delta(n)$ chosen in such a way that

\[ f''(\zeta)\,\delta^2 \to \infty, \qquad f'''(\zeta)\,\delta^3 \to 0, \tag{21} \]

so that both tail and central approximation conditions can be satisfied. We call this
choice the saddle-point dimensioning heuristic.
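When the quadratic and cubic coefficients grow like powers of n, the heuristic (21) reduces to a small computation on exponents. The following sketch is an illustration, not part of the original text; `cutoff_window` is a hypothetical helper that returns the open interval of exponents c for which $\delta = n^{-c}$ satisfies both conditions.

```python
from fractions import Fraction


def cutoff_window(quad_exp, cubic_exp):
    """Exponents c such that delta = n^(-c) satisfies (21), when the quadratic
    coefficient grows like n^quad_exp and the cubic one like n^cubic_exp.
    Returns the open interval (lo, hi), or None if no exponent works."""
    lo = Fraction(cubic_exp) / 3   # cubic * delta^3 -> 0   requires c > cubic_exp / 3
    hi = Fraction(quad_exp) / 2    # quad * delta^2 -> oo   requires c < quad_exp / 2
    return (lo, hi) if lo < hi else None


# Quadratic and cubic terms both of order n (in the angular variable):
print(cutoff_window(1, 1))
# Quadratic term of order n^(3/2), cubic term of order n^2:
print(cutoff_window(Fraction(3, 2), 2))
```

The first window contains the exponent 2/5 and the second contains 7/10, which are the cutoff choices adopted in the examples of this chapter.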

On another register, it often proves convenient to adopt integration paths that 
come close enough to the saddle-point but need not pass exactly through it. In the same 
vein, a steepest descent curve may be followed only approximately. Such choices 
will still lead to valid conclusions, as long as the conditions of Theorem VIII.3 are 
verified. (Note carefully that these conditions neither impose that the contour should 
pass strictly through the saddle-point, nor that a steepest descent curve should be 
exactly followed.) 


Saddle-point method for Cauchy coefficient integrals. For the purposes of an- 
alytic combinatorics, the general saddle-point method specializes. We are given a 
generating function G(z), assumed to be analytic at the origin and with non-negative 
coefficients, and seek an asymptotic form of the coefficients, given in integral form by 


\[ [z^n]G(z) = \frac{1}{2i\pi} \oint_{C} G(z) \, \frac{dz}{z^{n+1}}. \]

There, C encircles the origin, lies within the domain where G is analytic, and is positively
oriented. This is a particular case of the general integral (14) considered earlier,
with the integrand being $F(z) = G(z)/z^{n+1}$.

The geometry of the problem is now simple, and, for reasons seen in the previous 
section, it suffices to consider as integration contour a circle centred at the origin and 
passing through (or very near) a saddle-point present on the positive real line. It is 
then natural to make use of polar coordinates and set 


\[ z = re^{i\theta}, \]

where the radius r of the circle will be chosen equal to (or close to) the positive saddle-point
value. We thus need to estimate

\[ [z^n]G(z) = \frac{1}{2i\pi} \oint G(z) \, \frac{dz}{z^{n+1}} = \frac{r^{-n}}{2\pi} \int_{-\pi}^{+\pi} G(re^{i\theta})\, e^{-ni\theta}\, d\theta. \tag{22} \]

Under the circumstances, the basic split of the contour $C = C^{(0)} \cup C^{(1)}$ involves a
central part $C^{(0)}$, which is an arc of the circle of radius r determined by $|\theta| \le \theta_0$ for
some suitably chosen $\theta_0$. On $C^{(0)}$, a quadratic approximation should hold, according
to SP2 [central approximation]. Set

\[ f(z) := \log G(z) - n \log z. \tag{23} \]

A natural possibility is to adopt for r the value that cancels $f'(r)$,

\[ r\, \frac{G'(r)}{G(r)} = n, \tag{24} \]

which is a version of the saddle-point equation⁴ relative to polar coordinates. This
grants us, locally, a quadratic approximation without linear terms: with $\beta(r)$ a
computable quantity (in terms of $f(r)$, $f'(r)$, $f''(r)$), we have

\[ f(re^{i\theta}) - f(r) = -\frac12 \beta(r)\,\theta^2 + O(\theta^3), \tag{25} \]

which is valid at least for fixed r (i.e., for fixed n), as $\theta \to 0$.

The cutoff angle $\theta_0$ is to be chosen as a function of n (or, equivalently, r) in accordance
with the saddle-point heuristic (21). It then suffices to carry out a verification of
the validity of the three conditions of the saddle-point method, SP1, SP2 (for which a
suitably uniform version of (25) needs to be developed), and SP3 of Theorem VIII.3,
p. 553, adjusted to take into account polar coordinate notations.


The example below details the main steps of the saddle-point analysis of the gen- 
erating function of inverse factorials, based on the foregoing principles. 


Example VIII.3. Saddle-point analysis of the exponential and the inverse factorial I. The goal
is to estimate $\frac{1}{n!} = [z^n]e^z$, the starting point being

\[ K_n = \frac{1}{2i\pi} \int_{|z|=r} e^{z} \, \frac{dz}{z^{n+1}}, \]

where integration will be performed along a circle centred at the origin. The landscape of the
modulus of the integrand has been already displayed in Figure VIII.3, p. 550: there is a saddle-point
of $G(z)z^{-n-1}$ at $\zeta = n+1$ with an axis perpendicular to the real line. We thus expect an
asymptotic estimate to derive from adopting a circle passing through the saddle-point, or about.

We switch to polar coordinates, fix the choice of the radius r = n in accordance with (24),
and set $z = ne^{i\theta}$. The original integral becomes, in polar coordinates,

\[ K_n = \frac{e^{n}}{n^{n}} \cdot \frac{1}{2\pi} \int_{-\pi}^{+\pi} e^{n(e^{i\theta} - 1 - i\theta)}\, d\theta, \tag{26} \]

where, for readability, we have taken out the factor $G(r)/r^n \equiv e^n/n^n$. Set $h(\theta) = e^{i\theta} - 1 - i\theta$.
The function $|e^{h(\theta)}| = e^{\cos\theta - 1}$ is unimodal with its peak at $\theta = 0$ and the same property
holds for $|e^{nh(\theta)}|$, representing the modulus of the integrand in (26), which gets more and more
strongly peaked at $\theta = 0$, as $n \to +\infty$; see Figure VIII.5.


⁴Equation (24) is almost the same as $\zeta G'(\zeta)/G(\zeta) = n+1$ of (10), which defines the saddle-point in
z-coordinates. The (minor) difference is accounted for by the fact that saddle-points are sensitive to changes
of variables in integrals. In practice, it proves workable to integrate along a circle of radius either r or $\zeta$, or
even a suitably close approximation of r, $\zeta$, the choice being often suggested by computational convenience.

Figure VIII.5. Plots of $|e^{z}z^{-n-1}|$ for n = 3 and n = 30 (scaled according to the
value of the saddle-point) illustrate the essential concentration condition as higher
values of n produce steeper saddle-point paths.


In agreement with the saddle-point strategy, the estimation of $K_n$ proceeds by isolating a
small portion of the contour, corresponding to z near the real axis. We thus introduce

\[ K_n^{(0)} = \int_{-\theta_0}^{+\theta_0} e^{nh(\theta)}\, d\theta, \qquad K_n^{(1)} = \int_{\theta_0}^{2\pi - \theta_0} e^{nh(\theta)}\, d\theta, \]

and choose $\theta_0$ in accordance with the general heuristic of (21), which corresponds to the two
conditions: $n\theta_0^2 \to \infty$ (informally: $\theta_0 \gg n^{-1/2}$) and $n\theta_0^3 \to 0$ (informally: $\theta_0 \ll n^{-1/3}$).
One way of realizing the compromise is to adopt $\theta_0 = n^{a}$, where a is any number between
−1/2 and −1/3. To be specific, we fix a = −2/5, so

\[ \theta_0 \equiv \theta_0(n) = n^{-2/5}. \tag{27} \]

In particular, the angle of the central region tends to zero.


(i) Tails pruning. For $z = ne^{i\theta}$ one has $|e^{z}| = e^{n\cos\theta}$, and, by unimodality properties of
the cosine, the tail integral $K_n^{(1)}$ satisfies

\[ \left| K_n^{(1)} \right| = O\!\left( e^{-n(1-\cos\theta_0)} \right) = O\!\left( \exp\!\left( -Cn^{1/5} \right) \right), \tag{28} \]

for some C > 0. The tail integral is thus exponentially small.

(ii) Central approximation. Near $\theta = 0$, one has $h(\theta) \equiv e^{i\theta} - 1 - i\theta = -\frac12\theta^2 + O(\theta^3)$,
so that, for $|\theta| \le \theta_0$,

\[ e^{nh(\theta)} = e^{-n\theta^2/2} \left( 1 + O(n\theta_0^3) \right). \]

Since $\theta_0 = n^{-2/5}$, we have

\[ K_n^{(0)} = \int_{-n^{-2/5}}^{+n^{-2/5}} e^{-n\theta^2/2}\, d\theta \, \left( 1 + O(n^{-1/5}) \right), \tag{29} \]

which, by the change of variables $t = \theta\sqrt{n}$, becomes

\[ K_n^{(0)} = \frac{1}{\sqrt{n}} \int_{-n^{1/10}}^{+n^{1/10}} e^{-t^2/2}\, dt \, \left( 1 + O(n^{-1/5}) \right). \tag{30} \]

The central integral is thus asymptotic to an incomplete Gaussian integral.

(iii) Tails completion. Given (30), the task is now easy. We have, elementarily, for c > 0,

\[ \int_{c}^{+\infty} e^{-t^2/2}\, dt = O\!\left( e^{-c^2/2} \right), \tag{31} \]

which expresses the exponential smallness of Gaussian tails. As a consequence,

\[ K_n^{(0)} \sim \frac{1}{\sqrt{n}} \int_{-\infty}^{+\infty} e^{-t^2/2}\, dt \equiv \sqrt{\frac{2\pi}{n}}. \tag{32} \]

Assembling (28) and (32), we obtain

\[ K_n^{(0)} + K_n^{(1)} \sim \sqrt{\frac{2\pi}{n}}, \qquad \text{i.e.,} \qquad K_n = \frac{e^{n}}{2\pi n^{n}} \left( K_n^{(0)} + K_n^{(1)} \right) \sim \frac{e^{n}}{n^{n}\sqrt{2\pi n}}. \]

The proof also provides a relative error term of $O(n^{-1/5})$. Stirling's formula is thus seen to be
(inter alia!) a consequence of the saddle-point method.
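The integral representation (26) lends itself to a direct numerical check. The sketch below is illustrative and rests on assumptions of its own: it evaluates the integral with a plain trapezoidal rule on a uniform grid (a reasonable choice here because the integrand is $2\pi$-periodic), and the function names and the default of 512 grid points are arbitrary.

```python
import cmath
from math import exp, factorial, log, pi, sqrt


def central_integral(n, points=512):
    """(1/(2 pi)) * integral over (-pi, pi) of exp(n h(theta)),
    with h(theta) = e^{i theta} - 1 - i theta, by the trapezoidal rule."""
    total = 0.0
    for k in range(points):
        theta = -pi + 2 * pi * k / points
        h = cmath.exp(1j * theta) - 1 - 1j * theta
        # Imaginary parts cancel in pairs (theta, -theta), so only real parts are summed.
        total += cmath.exp(n * h).real
    return total / points


def inverse_factorial_by_quadrature(n, points=512):
    """K_n as given by (26): (e^n / n^n) times the integral above."""
    return exp(n - n * log(n)) * central_integral(n, points)


n = 20
print(inverse_factorial_by_quadrature(n) * factorial(n))   # ratio to the exact 1/n!
print(central_integral(n) * sqrt(2 * pi * n))              # ratio to the Gaussian estimate
```

The first ratio tests the exactness of (26); the second measures how close the full integral already is to the Gaussian value $1/\sqrt{2\pi n}$ that results from (32) after division by $2\pi$.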


Complete asymptotic expansions. Just like Laplace's method, the saddle-point
method can often be made to provide complete asymptotic expansions. The idea is
still to localize the main contribution in the central region, but now take into account
correction terms to the quadratic approximation. As an illustration of these general
principles, we make explicit here the calculations relative to the inverse factorial.


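Example VIII.4. Saddle-point analysis of the exponential and the inverse factorial II. For a
complete expansion of $[z^n]e^z$, we only need to revisit the estimation of $K_n^{(0)}$ in the previous
example, since $K_n^{(1)}$ is exponentially small anyhow. One first rewrites

\[ K_n^{(0)} = \int_{-\theta_0}^{+\theta_0} e^{-n\theta^2/2}\, e^{n(e^{i\theta} - 1 - i\theta + \frac12\theta^2)}\, d\theta
 = \frac{1}{\sqrt{n}} \int_{-\theta_0\sqrt{n}}^{+\theta_0\sqrt{n}} e^{-w^2/2}\, e^{n\xi(w/\sqrt{n})}\, dw, \qquad \xi(\theta) := e^{i\theta} - 1 - i\theta + \frac12\theta^2, \]

so that the integrand agrees with the $e^{nh(\theta)}$ of (26).
The calculation proceeds exactly in the same way as for the Laplace method (Appendix B.6:
Laplace's method, p. 755). It suffices to expand $h(\theta)$ to any fixed order, which is legitimate in
the central region. In this way, a representation of the form

\[ K_n^{(0)} = \frac{1}{\sqrt{n}} \int_{-\theta_0\sqrt{n}}^{+\theta_0\sqrt{n}} e^{-w^2/2} \left( 1 + \sum_{k=1}^{M-1} \frac{E_k(w)}{n^{k/2}} + O\!\left( \frac{1 + w^{3M}}{n^{M/2}} \right) \right) dw \]

is obtained, where the $E_k(w)$ are computable polynomials of degree 3k. Distributing the integral
operator over terms in the asymptotic expansion and completing the tails yields an expansion
of the form

\[ K_n^{(0)} \sim \frac{1}{\sqrt{n}} \left( \sum_{k=0}^{M-1} \frac{d_k}{n^{k/2}} + O(n^{-M/2}) \right), \]

where $d_0 = \sqrt{2\pi}$ and $d_k := \int_{-\infty}^{+\infty} e^{-w^2/2} E_k(w)\, dw$. All odd terms disappear by parity. The
net result is then the following.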


Proposition VIII.1 (Stirling's formula). The factorial numbers satisfy

\[ \frac{1}{n!} \sim \frac{e^{n} n^{-n}}{\sqrt{2\pi n}} \left( 1 - \frac{1}{12n} + \frac{1}{288n^2} + \frac{139}{51840n^3} - \frac{571}{2488320n^4} + \cdots \right). \]

Notice the amazing similarity with the form obtained directly for n! in Appendix B.6:
Laplace's method, p. 755.
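The quality of the expansion in Proposition VIII.1 can be observed numerically. The following sketch is an illustration with hypothetical function names; it truncates the series after a chosen number of terms and reports the relative error against the exact value of 1/n!.

```python
from fractions import Fraction
from math import exp, factorial, log, pi, sqrt

# Coefficients of n^0, n^-1, ..., n^-4 as printed in Proposition VIII.1.
COEFFS = [Fraction(1), Fraction(-1, 12), Fraction(1, 288),
          Fraction(139, 51840), Fraction(-571, 2488320)]


def inverse_factorial_approx(n, terms=5):
    """Truncated expansion of 1/n! using the first `terms` coefficients."""
    series = sum(float(c) / n ** k for k, c in enumerate(COEFFS[:terms]))
    leading = exp(n - n * log(n)) / sqrt(2 * pi * n)
    return leading * series


def relative_error(n, terms=5):
    """Relative error of the truncated expansion against the exact 1/n!."""
    exact = 1 / factorial(n)
    return abs(inverse_factorial_approx(n, terms) - exact) / exact


for terms in (1, 3, 5):
    print(terms, relative_error(10, terms))
```

Since the expansion is asymptotic rather than convergent, adding terms helps only up to a point for fixed n; for n = 10 the five printed terms are all still beneficial.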


▷ VIII.6. A factorial surprise. Why is it that the expansions of n! and 1/n! involve the same set
of coefficients, up to sign? ◁


VIII. 4. Three combinatorial examples 


The saddle-point method permits us to solve a number of asymptotic problems
coming from analytic combinatorics. In this section, we illustrate its use by treating
in some detail three combinatorial examples⁵:

Involutions ($\mathcal{I}$), Set partitions ($\mathcal{S}$), Fragmented permutations ($\mathcal{F}$).


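These are all labelled structures introduced in Chapter II. Their specifications and
EGFs are

\[ \begin{array}{llcl}
\text{Involutions:} & \mathcal{I} = \mathrm{SET}(\mathrm{SET}_{1,2}(\mathcal{Z})) & \Longrightarrow & I(z) = e^{z+z^2/2} \\
\text{Set partitions:} & \mathcal{S} = \mathrm{SET}(\mathrm{SET}_{\ge 1}(\mathcal{Z})) & \Longrightarrow & S(z) = e^{e^{z}-1} \\
\text{Fragmented perms:} & \mathcal{F} = \mathrm{SET}(\mathrm{SEQ}_{\ge 1}(\mathcal{Z})) & \Longrightarrow & F(z) = e^{z/(1-z)}.
\end{array} \tag{33} \]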


The first two are entire functions (i.e., they only have a singularity at oo), while the 
last one has a singularity at z = 1. Each of these functions exhibits a fairly vio- 
lent growth—of an exponential type—near its positive singularity, at either a finite or 
infinite distance. As the reader will have noticed, all three combinatorial types are 
structurally characterized by a set construction applied to some simpler structure. 

Each example is treated, starting from the easier saddle-point bounds and pro- 
ceeding with the saddle-point method. The example of involutions deals with a prob- 
lem that is only a little more complicated than inverse factorials. The case of set 
partitions (Bell numbers) illustrates the need in general of a good asymptotic tech- 
nology for implicitly defined saddle-points. Finally, fragmented permutations, with 
their singularity at a finite distance, pave the way for the (harder) analysis of integer 
partitions in Section VIII.6. We recapitulate the main features of the saddle-point 
analyses of these three structures, together with the case of inverse factorials (urns), 
in Figure VIII.6. 


Example VIII.5. Involutions. An involution is a permutation $\tau$ such that $\tau^2$ is the identity
permutation (p. 122). The corresponding EGF is $I(z) = e^{z+z^2/2}$. We have, in the notation
of (23),

\[ f(z) = z + \frac{z^2}{2} - n\log z, \]

and the saddle-point equation in polar coordinates is

\[ r(1+r) = n, \qquad \text{implying} \qquad r = -\frac12 + \frac12\sqrt{4n+1} \sim \sqrt{n} - \frac12 + \frac{1}{8\sqrt{n}} + O(n^{-3/2}). \]


⁵The purpose of these examples is to become further familiarized with the practice of the saddle-point
method in analytic combinatorics. The impatient reader can jump directly to the next section, where she
will find a general theory that covers these and many more cases.

Class | EGF | radius (r) | angle ($\theta_0$) | coeff. $[z^n]$ in EGF
urns, SET(Z) (Ex. VIII.3, p. 555) | $e^{z}$ | $n$ | $n^{-2/5}$ | $\sim \dfrac{e^{n} n^{-n}}{\sqrt{2\pi n}}$
involutions, SET(CYC$_{1,2}$(Z)) (Ex. VIII.5, p. 558) | $e^{z+z^2/2}$ | $\sim \sqrt{n} - \frac12$ | $n^{-2/5}$ | $\sim \dfrac{e^{n/2-1/4}\, n^{-n/2}}{2\sqrt{\pi n}}\, e^{\sqrt{n}}$
set partitions, SET(SET$_{\ge 1}$(Z)) (Ex. VIII.6, p. 560) | $e^{e^{z}-1}$ | $\sim \log n - \log\log n$ | $e^{-2r/5}/r$ | $\sim \dfrac{e^{e^{r}-1}}{r^{n}\sqrt{2\pi r(r+1)e^{r}}}$
fragmented perms, SET(SEQ$_{\ge 1}$(Z)) (Ex. VIII.7, p. 562) | $e^{z/(1-z)}$ | $\sim 1 - \dfrac{1}{\sqrt{n}}$ | $n^{-7/10}$ | $\sim \dfrac{e^{-1/2+2\sqrt{n}}}{2\sqrt{\pi}\, n^{3/4}}$

Figure VIII.6. A summary of some major saddle-point analyses in combinatorics.


The use of the saddle-point bound then gives mechanically

\[ \frac{I_n}{n!} \le e^{-1/4}\, \frac{e^{n/2+\sqrt{n}}}{n^{n/2}}\, (1+o(1)), \qquad I_n \le e^{-1/4} \sqrt{2\pi n}\; e^{-n/2+\sqrt{n}}\, n^{n/2}\, (1+o(1)). \tag{34} \]

(Notice that if we use instead the approximate saddle-point value, $\sqrt{n}$, we only lose a factor
$e^{-1/4} \doteq 0.77880$.)

The cutoff point between the central and non-central regions is determined, in agreement
with (21), by the fact that the length $\delta$ of the contour (in z coordinates) should satisfy
$f''(r)\delta^2 \to \infty$ and $f'''(r)\delta^3 \to 0$. In terms of angles, this means that we should choose a
cutoff angle $\theta_0$ that satisfies

\[ r^2 f''(r)\,\theta_0^2 \to \infty, \qquad r^3 f'''(r)\,\theta_0^3 \to 0. \]

Here, we have $f''(r) = O(1)$ and $f'''(r) = O(n^{-1/2})$. Thus, $\theta_0$ must be of an order somewhere
in between $n^{-1/2}$ and $n^{-1/3}$, and we fix

\[ \theta_0 = n^{-2/5}. \]


(i) Tails pruning. First, some general considerations are to be made, regarding the behaviour
of $|I(z)|$ along large circles, $z = re^{i\theta}$. One has

\[ \log\left| I(re^{i\theta}) \right| = r\cos\theta + \frac{r^2}{2}\cos 2\theta. \]

As a function of $\theta$, this function decreases on $(0, \frac{\pi}{2})$, since it is the sum of two decreasing
functions. Thus, $|I(z)|$ attains its maximum ($e^{r+r^2/2}$) at r and its minimum ($e^{-r^2/2}$) at $z = ri$.
In the left half-plane, first for $\theta \in (\frac{\pi}{2}, \frac{3\pi}{4})$, the modulus $|I(z)|$ is at most $e^{r}$ since $\cos 2\theta < 0$.
Finally, for $\theta \in (\frac{3\pi}{4}, \pi)$, smallness is granted by the fact that $\cos\theta < -1/\sqrt{2}$, resulting in the
bound $|I(z)| \le e^{r^2/2 - r/\sqrt{2}}$. The same argument applies to the lower half plane $\Im(z) < 0$.
As a consequence of these bounds, $I(z)/I(\sqrt{n})$ is strongly peaked at z = r; in particular, it is
exponentially small away from the positive real axis, in the sense that

\[ \frac{I(re^{i\theta})}{I(r)} = O\!\left( \exp(-n^{\alpha}) \right), \qquad \theta \notin [-\theta_0, \theta_0], \tag{35} \]

for some $\alpha > 0$.


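(ii) Central approximation. We then proceed and consider the central integral

\[ J_n^{(0)} = \frac{e^{f(r)}}{2\pi} \int_{-\theta_0}^{+\theta_0} \exp\!\left( f(re^{i\theta}) - f(r) \right) d\theta. \]

What is required is a Taylor expansion with remainder near the point $r \sim \sqrt{n}$. In the central
region, the relations $f'(r) = 0$, $f''(r) = 2 + 1/r = 2 + O(n^{-1/2})$, and $f'''(z) = O(n^{-1/2})$ yield

\[ f(re^{i\theta}) - f(r) = \frac{r^2}{2} f''(r)\,(e^{i\theta}-1)^2 + O\!\left( n^{-1/2} r^3 \theta_0^3 \right) = -r^2\theta^2 + O(n^{-1/5}). \]

This is enough to guarantee that

\[ J_n^{(0)} = \frac{e^{f(r)}}{2\pi} \int_{-\theta_0}^{+\theta_0} e^{-r^2\theta^2}\, d\theta \, \left( 1 + O(n^{-1/5}) \right). \tag{36} \]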
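(iii) Tails completion. Since $r \sim \sqrt{n}$ and $\theta_0 = n^{-2/5}$, we have

\[ \int_{-\theta_0}^{+\theta_0} e^{-r^2\theta^2}\, d\theta = \frac{1}{r} \int_{-\theta_0 r}^{+\theta_0 r} e^{-t^2}\, dt = \frac{1}{r} \left( \int_{-\infty}^{+\infty} e^{-t^2}\, dt + O\!\left( e^{-n^{1/5}} \right) \right). \tag{37} \]

Finally, Equations (35), (36), and (37) give:

Proposition VIII.2. The number $I_n$ of involutions satisfies

\[ \frac{I_n}{n!} = \frac{e^{-1/4}}{2\sqrt{\pi n}}\, n^{-n/2}\, e^{n/2+\sqrt{n}} \left( 1 + O\!\left( \frac{1}{n^{1/5}} \right) \right). \tag{38} \]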

Comparing the saddle-point bound (34) to the true asymptotic form (38), we see that the
former is only off by a factor of $O(n^{1/2})$. Here is a table further comparing the asymptotic
estimate $I_n^{\star}$ provided by the right side of (38) to the exact value of $I_n$:

n | 10 | 100 | 1000
$I_n$ | 9496 | $2.40533 \cdot 10^{82}$ | $2.14392 \cdot 10^{1296}$
$I_n^{\star}$ | 8839 | $2.34149 \cdot 10^{82}$ | $2.12473 \cdot 10^{1296}$

The relative error is empirically close to $0.3/\sqrt{n}$, a fact that could be proved by developing a
complete asymptotic expansion along the lines expounded in the previous section, p. 557.
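The table can be reproduced with a few lines of code. The sketch below is illustrative: it computes $I_n$ exactly from the classical recurrence $I_n = I_{n-1} + (n-1)I_{n-2}$ (a standard fact that is not derived in this section) and compares it, through logarithms to avoid overflow at n = 1000, with the right side of (38). Function names are assumptions of this sketch.

```python
from math import exp, lgamma, log, pi, sqrt


def involutions(n):
    """Exact number of involutions, via I_n = I_{n-1} + (n-1) I_{n-2}, I_0 = I_1 = 1."""
    previous, current = 1, 1
    for m in range(2, n + 1):
        previous, current = current, current + (m - 1) * previous
    return current


def log_involutions_estimate(n):
    """Logarithm of n! times the main term on the right side of (38)."""
    return (lgamma(n + 1) - 0.25 - 0.5 * log(4 * pi * n)
            - 0.5 * n * log(n) + 0.5 * n + sqrt(n))


def relative_error(n):
    """(I_n - estimate) / I_n, computed from logarithms."""
    return 1 - exp(log_involutions_estimate(n) - log(involutions(n)))


for n in (10, 100, 1000):
    print(n, relative_error(n), 0.3 / sqrt(n))
```

The second printed column can be compared with the empirical value $0.3/\sqrt{n}$ quoted above.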

The estimate (38) of $I_n$ is given by Knuth in [378]: his derivation is carried out by means
of the Laplace method applied to the explicit binomial sum that expresses $I_n$. Our complex
analytic derivation follows Moser and Wyman's in [448].


Example VIII.6. Set partitions and Bell numbers. The number of partitions of a set of n
elements defines the Bell number $S_n$ (p. 109) and one has

\[ S_n = n!\, e^{-1}\, [z^n]G(z) \qquad \text{where} \qquad G(z) = e^{e^{z}}. \]

The saddle-point equation relative to $G(z)z^{-n-1}$ (in z-coordinates) is

\[ \zeta e^{\zeta} = n + 1. \]

This famous equation admits an asymptotic solution obtained by iteration (or "bootstrapping"):
it suffices to write $\zeta = \log(n+1) - \log\zeta$, and iterate (say, starting from $\zeta = 1$), which provides
the solution,

\[ \zeta \equiv \zeta(n) = \log n - \log\log n + \frac{\log\log n}{\log n} + O\!\left( \frac{\log^2\log n}{\log^2 n} \right) \tag{39} \]

(see [143, p. 26] for a detailed discussion). The corresponding saddle-point bound reads

\[ S_n \le n!\, \frac{e^{e^{\zeta}-1}}{\zeta^{n}}. \]

The approximate solution $\widehat\zeta = \log n$ yields in particular the simplified upper bound

\[ S_n \le n!\, \frac{e^{n-1}}{(\log n)^{n}}, \]

which is enough to check that there are much fewer set partitions than permutations, the ratio
being bounded from above by a quantity $e^{-n\log\log n + O(n)}$.
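The bootstrapping iteration for the saddle-point equation is easily carried out in floating point. The sketch below is an illustration under stated assumptions: the function names and the iteration count are arbitrary, and the iteration is only meant for large n, where the root exceeds 1 and the map $\zeta \mapsto \log(n+1) - \log\zeta$ is a contraction near it.

```python
from math import exp, log


def saddle_point_bootstrap(n, iterations=200):
    """Iterate zeta <- log(n+1) - log(zeta), starting from zeta = 1.
    Intended for large n, where the iteration contracts towards the root."""
    zeta = 1.0
    for _ in range(iterations):
        zeta = log(n + 1) - log(zeta)
    return zeta


def saddle_point_asymptotic(n):
    """The three explicit terms of (39)."""
    big_l = log(n)
    return big_l - log(big_l) + log(big_l) / big_l


n = 10 ** 6
zeta = saddle_point_bootstrap(n)
print(zeta, zeta * exp(zeta) / (n + 1))     # second value should be very close to 1
print(saddle_point_asymptotic(n))
```

The gap between the two printed approximations is within the error term of (39), which decays only like $(\log\log n/\log n)^2$; this slow decay is one reason the exact implicit root is preferred in what follows.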
In order to implement the saddle-point strategy, integration will be carried out over a circle
of radius $r \equiv \zeta$. We then set

\[ f(z) = \log\!\left( \frac{G(z)}{z^{n+1}} \right) = e^{z} - (n+1)\log z, \]

and proceed to estimate the integral,

\[ J_n = \frac{1}{2i\pi} \int_{C} G(z)\, \frac{dz}{z^{n+1}}, \]

along the circle C of radius r. The usual saddle-point heuristic suggests that the range of the
saddle-point is determined by a quantity $\theta_0 \equiv \theta_0(n)$ such that the quadratic terms in the
expansion of f at r tend to infinity, while the cubic terms tend to zero. In order to carry out the
calculations, it is convenient to express all quantities in terms of r alone, which is possible since
n can be disposed of by means of the relation $n + 1 = re^{r}$. We find:

\[ f''(r) = e^{r}(1 + r^{-1}), \qquad f'''(r) = e^{r}(1 - 2r^{-2}). \]

Thus, $\theta_0$ should be chosen such that $r^2 e^{r}\theta_0^2 \to \infty$, $r^3 e^{r}\theta_0^3 \to 0$, and the choice $r\theta_0 = e^{-2r/5}$
is suitable.


(i) Tails pruning. First, observe that the function G(z) is strongly concentrated near the
real axis since, with $z = re^{i\theta}$, there holds

\[ \left| e^{e^{z}} \right| \le e^{e^{r\cos\theta}}. \tag{40} \]

In particular $G(re^{i\theta})$ is exponentially smaller than G(r) for any fixed $\theta \ne 0$, when r gets large.

(ii) Central approximation. One then considers the central contribution,

\[ J_n^{(0)} := \frac{1}{2i\pi} \int_{C^{(0)}} G(z)\, \frac{dz}{z^{n+1}}, \]

where $C^{(0)}$ is the part of the circle $z = re^{i\theta}$ such that $|\theta| \le \theta_0 \equiv e^{-2r/5} r^{-1}$. Since on $C^{(0)}$,
the third derivative is uniformly $O(e^{r})$, one has there

\[ f(re^{i\theta}) = f(r) - \frac12 r^2\theta^2 f''(r) + O\!\left( r^3\theta^3 e^{r} \right). \]

This approximation can then be transported into the integral $J_n^{(0)}$.

(iii) Tails completion. Tails can be completed in the usual way. The net effect is the
estimate

\[ [z^n]G(z) = \frac{e^{f(r)}}{\sqrt{2\pi f''(r)}} \left( 1 + O\!\left( r^3\theta_0^3 e^{r} \right) \right), \]

which, upon making the error term explicit, rephrases as follows.

Proposition VIII.3. The number $S_n$ of set partitions of size n satisfies

\[ S_n = n!\, \frac{e^{e^{r}-1}}{r^{n}\sqrt{2\pi r(r+1)e^{r}}} \left( 1 + O\!\left( e^{-r/5} \right) \right), \tag{41} \]

where r is defined implicitly by $re^{r} = n + 1$, so that $r = \log n - \log\log n + o(1)$.
Here is a numerical table of the exact values $S_n$ compared to the main term $S_n^{\star}$ of the
approximation (41):

n | 10 | 100 | 1000
$S_n$ | 115975 | $4.75853 \cdot 10^{115}$ | $2.98990 \cdot 10^{1927}$
$S_n^{\star}$ | 114204 | $4.75537 \cdot 10^{115}$ | $2.99012 \cdot 10^{1927}$

The error is about 1.5% for n = 10, less than $10^{-3}$ and $10^{-4}$ for n = 100 and n = 1000.
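The comparison can be replayed numerically. The sketch below is illustrative and makes choices of its own: exact Bell numbers are obtained from the Bell triangle (a standard device not discussed in this section), the radius r is obtained from $re^{r} = n+1$ by Newton's method started at $\log(n+1)$, and the main term of (41) is evaluated through logarithms.

```python
from math import exp, lgamma, log, pi


def bell(n):
    """Exact Bell number S_n, computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


def saddle_radius(n, iterations=60):
    """Solve r e^r = n + 1 by Newton's method, starting from log(n + 1)."""
    r = log(n + 1)
    for _ in range(iterations):
        r -= (r * exp(r) - (n + 1)) / (exp(r) * (r + 1))
    return r


def log_bell_estimate(n):
    """Logarithm of the main term of (41)."""
    r = saddle_radius(n)
    return (lgamma(n + 1) + exp(r) - 1 - n * log(r)
            - 0.5 * log(2 * pi * r * (r + 1) * exp(r)))


def relative_error(n):
    """(S_n - main term) / S_n, computed from logarithms."""
    return 1 - exp(log_bell_estimate(n) - log(bell(n)))


for n in (10, 100):
    print(n, bell(n), relative_error(n))
```

Keeping r as an implicitly defined quantity, rather than substituting an expansion in n, is what makes the estimate this accurate already for small n.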

The asymptotic form in terms of r itself is the proper one as no back substitution of an
asymptotic expansion of r (in terms of n and log n) can provide an asymptotic expansion for $S_n$
solely in terms of n. Regarding explicit representations in terms of n, it is only $\log S_n$ that can
be expanded as

\[ \frac{1}{n}\log S_n = \log n - \log\log n - 1 + \frac{\log\log n}{\log n} + \frac{1}{\log n} + O\!\left( \left( \frac{\log\log n}{\log n} \right)^{2} \right). \]

(Saddle-point estimates of coefficient integrals often involve such implicitly defined quantities.)

This example probably constitutes the most famous application of saddle-point techniques
to combinatorial enumeration. The first correct treatment by means of the saddle-point method
is due to Moser and Wyman [447]. It is used for instance by de Bruijn in [143, pp. 104-108] as
a lead example of the method.


Example VIII.7. Fragmented permutations. These correspond to $F(z) = \exp(z/(1-z))$.
The example now illustrates the case of a singularity at a finite distance. We set as usual

\[ f(z) = \frac{z}{1-z} - (n+1)\log z, \]

and start with saddle-point bounds. The saddle-point equation is

\[ \frac{\zeta}{(1-\zeta)^2} = n + 1, \tag{42} \]

so that $\zeta$ comes close to the singularity at 1 as n gets large:

\[ \zeta = \frac{2n + 3 - \sqrt{4n+5}}{2n+2} = 1 - \frac{1}{\sqrt{n}} + \frac{1}{2n} + O(n^{-3/2}). \]

Here, the approximation $\widehat\zeta(n) = 1 - 1/\sqrt{n}$ leads to

\[ [z^n]F(z) \le e^{-1/2}\, e^{2\sqrt{n}}\, (1 + o(1)). \tag{43} \]


The saddle-point method is then applied with integration along a circle of radius $r = \zeta$.
The saddle-point heuristic suggests to restrict the integral to a small sector of angle $2\theta_0$, and,
since $f''(r) = O(n^{3/2})$ while $f'''(r) = O(n^{2})$, this means taking $\theta_0$ such that $n^{3/4}\theta_0 \to \infty$
and $n^{2/3}\theta_0 \to 0$. For instance, the choice $\theta_0 = n^{-7/10}$ is suitable. Concentration is easily
verified: since $z/(1-z) = 1/(1-z) - 1$, we have

\[ \left| e^{z/(1-z)} \right|_{z = re^{i\theta}} = e^{-1} \cdot \exp\!\left( \frac{1 - r\cos\theta}{1 - 2r\cos\theta + r^2} \right), \]

which is a unimodal function of $\theta$ for $\theta \in (-\pi, \pi)$. (The maximum of this function of $\theta$ is of
order $\exp((1-r)^{-1})$ and is attained at $\theta = 0$; the minimum is O(1), attained at $\theta = \pi$.) In
particular, along the non-central part $|\theta| \ge \theta_0$ of the saddle-point circle, one has

\[ \left| e^{z/(1-z)} \right|_{z = re^{i\theta}} = O\!\left( \exp\!\left( \sqrt{n} - n^{1/10} \right) \right), \tag{44} \]

so that tails are exponentially small. Local expansions then enable us to justify the use of the
general saddle-point formula in this case. The net result is the following.


Proposition VIII.4. The number of fragmented permutations, $F_n = n!\,[z^n]F(z)$, satisfies

\[ \frac{F_n}{n!} \sim \frac{e^{-1/2}\, e^{2\sqrt{n}}}{2\sqrt{\pi}\, n^{3/4}}. \tag{45} \]

Quite characteristically, the corresponding saddle-point bound (43) turns out to be off the
asymptotic estimate (45) only by a factor of order $n^{3/4}$. The relative error of the approximation
(45) is about 4%, 1%, 0.3% for n = 10, 100, 1000, respectively.

The expansion above has been extended by E. Maitland Wright [618, 619] to several
classes of functions with a singularity whose type is an exponential of a function of the form
$(1-z)^{-\rho}$; see Note VIII.7. (For the case of (45), Wright [618] refers to an earlier article of
Perron published in 1914.) His interest was due, at least partly, to applications to generalized
partition asymptotics, of which the basic cases are discussed in Section VIII. 6, p. 574.
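The accuracy of (45) can be examined against exact values. The sketch below is an illustration resting on one assumption not spelled out in this section: expanding $e^{z/(1-z)} = \sum_k (z/(1-z))^k/k!$ and using $[z^n](z/(1-z))^k = \binom{n-1}{k-1}$ gives $F_n/n! = \sum_{k=1}^{n} \binom{n-1}{k-1}/k!$, which is evaluated exactly with rationals. Function names are hypothetical.

```python
from fractions import Fraction
from math import comb, exp, factorial, pi, sqrt


def fragmented_ratio(n):
    """Exact F_n / n! = [z^n] exp(z/(1-z)) = sum_{k=1}^{n} C(n-1, k-1) / k!."""
    return sum(Fraction(comb(n - 1, k - 1), factorial(k)) for k in range(1, n + 1))


def fragmented_estimate(n):
    """Right side of (45)."""
    return exp(-0.5 + 2 * sqrt(n)) / (2 * sqrt(pi) * n ** 0.75)


def relative_error(n):
    """(estimate - exact) / exact for the ratio F_n / n!."""
    return fragmented_estimate(n) / float(fragmented_ratio(n)) - 1


for n in (10, 100):
    print(n, fragmented_ratio(n) * factorial(n), relative_error(n))
```

The printed relative errors can be set against the percentages quoted after Proposition VIII.4.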


▷ VIII.7. Wright's expansions. Consider the function

\[ F(z) = (1-z)^{-\beta} \exp\!\left( \frac{A}{(1-z)^{\rho}} \right), \qquad A > 0, \quad \rho > 0. \]

Then, a saddle-point analysis yields, when $\rho < 1$:

\[ [z^n]F(z) \sim \frac{N^{\beta - 1 - \rho/2} \exp\!\left( A(\rho+1)N^{\rho} \right)}{\sqrt{2\pi A\rho(\rho+1)}}, \qquad N := \left( \frac{n}{A\rho} \right)^{\frac{1}{\rho+1}}. \]

(The case $\rho \ge 1$ involves more terms of the asymptotic expansion of the saddle-point.) The
method generalizes to analytic and logarithmic multipliers, as well as to a sum of terms of the
form $A(1-z)^{-\rho}$ inside the exponential. See [618, 619] for details. ◁


▷ VIII.8. Some oscillating coefficients. Define the function

\[ s(z) = \sin\!\left( \frac{z}{1-z} \right). \]

The coefficients $s_n = [z^n]s(z)$ are seen to change sign at n = 6, 21, 46, 81, 125, 180, .... Do
signs change infinitely many times? (Hint: Yes. There are two complex conjugate saddle-points
and the associated asymptotic forms combine a growth of the type $n^{a} e^{b\sqrt{n}}$ with an oscillating
factor similar to $\sin\sqrt{n}$.) The sum


exhibits similar fluctuations. ◁

VIII. 5. Admissibility


The saddle-point method is a versatile approach to the analysis of coefficients 
of fast-growing generating functions, but one which is often cumbersome to apply 
step-by-step. Fortunately, it proves possible to encapsulate the conditions repeatedly 
encountered in our previous examples into a general framework. This leads to the 
notion of an admissible function presented in Subsection VIII. 5.1. By design, saddle- 
point analysis applies to such functions and asymptotic forms for their coefficients 
can be systematically determined: this follows an approach initiated by Hayman in 
1956. A great merit of abstraction in this context is that admissible functions satisfy 
useful closure properties, so that an infinite class of admissible functions of relevance 
to combinatorial applications can be determined; we develop this theme in Subsection
VIII. 5.2, relative to enumeration. Finally, Subsection VIII. 5.3 presents an approach
to the probabilistic problem known as depoissonization, which is much akin to
admissibility.


VIII. 5.1. Admissibility theory. The notion of admissibility is in essence an ax- 
iomatization of the conditions underlying Theorem VIII.3 particularized to the case 
of Cauchy coefficient integrals. In this section, we base our discussion on H—admis- 
sibility, the prefix H being a token of Hayman’s original contribution [325]. A crisp 
account of the theory is given in Section I.7 of Wong’s book [614] and in Odlyzko’s 
authoritative survey [461, Sec. 12]. 


We consider here a function G(z) that is analytic at the origin and whose coefficients
$[z^n]G(z)$ are to be estimated by

    $g_n \equiv [z^n]G(z) = \frac{1}{2i\pi} \oint G(z)\, \frac{dz}{z^{n+1}}.$


The switch to polar coordinates is natural, so that the expansion of $G(re^{i\theta})$ for small $\theta$
plays a central rôle: with r a positive real number lying within the disc of analyticity
of G(z), the fundamental expansion is

(46)    $\log G(re^{i\theta}) = \log G(r) + \sum_{\nu=1}^{\infty} \alpha_\nu(r)\, \frac{(i\theta)^\nu}{\nu!}.$
Not surprisingly, the most important quantities are the first two terms, and once G(z)
has been put into exponential form, $G(z) = e^{h(z)}$, a simple computation yields

(47)    $a(r) := \alpha_1(r) = r h'(r),$
        $b(r) := \alpha_2(r) = r^2 h''(r) + r h'(r),$    with $h(z) := \log G(z)$.

In terms of G itself, one thus has

(48)    $a(r) = r \frac{G'(r)}{G(r)},$    $b(r) = r \frac{G'(r)}{G(r)} + r^2 \frac{G''(r)}{G(r)} - r^2 \left(\frac{G'(r)}{G(r)}\right)^2.$

Whenever G(z) has non-negative Taylor coefficients at the origin, b(r) is positive for
r > 0 and a(r) increases as $r \to \rho$, with $\rho$ the radius of convergence of G. (This
follows from the argument developed in Note VIII.4, p. 550.)
Definition VIII.1 (Hayman admissibility). Let G(z) have radius of convergence $\rho$
with $0 < \rho \le +\infty$ and be always positive on some subinterval $(R_0, \rho)$ of $(0, \rho)$. The
function G(z) is said to be H-admissible (Hayman admissible) if, with a(r) and b(r)
as defined in (47), it satisfies the following three conditions:

H1. [Capture condition] $\lim_{r\to\rho} a(r) = +\infty$ and $\lim_{r\to\rho} b(r) = +\infty$.

H2. [Locality condition] For some function $\theta_0(r)$ defined over $(R_0, \rho)$ and
satisfying $0 < \theta_0 < \pi$, one has

    $G(re^{i\theta}) \sim G(r)\, e^{i\theta a(r) - \theta^2 b(r)/2}$    as $r \to \rho$,

uniformly in $|\theta| \le \theta_0(r)$.

H3. [Decay condition] Uniformly in $\theta_0(r) \le |\theta| < \pi$,

    $G(re^{i\theta}) = o\left(\frac{G(r)}{\sqrt{b(r)}}\right).$


Note that the conditions in the definition are intrinsic to the function: they only
make reference to the function's values along circles, no parameter n being involved
yet. It can be easily verified, from the previous examples, that the functions $e^z$, $e^{e^z-1}$,
and $e^{z+z^2/2}$ are admissible with $\rho = +\infty$, and that the function $e^{z/(1-z)}$ is admissible
with $\rho = 1$ (refer in each case to the discussion of the behaviour of the modulus of
$G(re^{i\theta})$, as $\theta$ varies). By contrast, functions such as $e^{z^2}$ and $e^{z^2} + e^z$ are not admissible
since they attain values that are too large when arg(z) is near $\pi$.

Coefficients of H-admissible functions can be systematically analysed to first
asymptotic order, as expressed by the following theorem:

Theorem VIII.4 (Coefficients of admissible functions). Let G(z) be an H-admissible
function and $\zeta \equiv \zeta(n)$ be the unique solution in the interval $(R_0, \rho)$ of the equation

(49)    $\zeta \frac{G'(\zeta)}{G(\zeta)} = n.$

The Taylor coefficients of G(z) satisfy, as $n \to \infty$:

(50)    $g_n \equiv [z^n]G(z) \sim \frac{G(\zeta)}{\zeta^n \sqrt{2\pi b(\zeta)}},$    $b(z) := z^2 \frac{d^2}{dz^2} \log G(z) + z \frac{d}{dz} \log G(z).$
Proof. The proof simply amounts to transcribing the definition of admissibility into
the conditions of Theorem VIII.3. Integration is carried out over a circle centred at the
origin, of some radius r to be specified shortly. Under the change of variable $z = re^{i\theta}$,
the Cauchy coefficient formula becomes

(51)    $g_n \equiv [z^n]G(z) = \frac{r^{-n}}{2\pi} \int_{-\pi}^{+\pi} G(re^{i\theta})\, e^{-ni\theta}\, d\theta.$

In order to obtain a quadratic approximation without a linear term, one chooses
the radius of the circle as the positive solution $\zeta$ of the equation $a(\zeta) = n$, that is, a
solution of Equation (49). (Thus $\zeta$ is a saddle-point of $G(z)z^{-n}$.) By the capture
condition H1, we have $\zeta \to \rho^-$ as $n \to +\infty$. Following the general saddle-point strategy,
we decompose the integration domain and set, with $\theta_0$ as specified in conditions H2
and H3:

    $J^{(0)} = \int_{-\theta_0}^{+\theta_0} G(\zeta e^{i\theta})\, e^{-ni\theta}\, d\theta,$    $J^{(1)} = \int_{\theta_0}^{2\pi-\theta_0} G(\zeta e^{i\theta})\, e^{-ni\theta}\, d\theta.$


(i) Tails pruning. By the decay condition H3, we have a trivial bound, which
suffices for our purposes:

(52)    $J^{(1)} = o\left(\frac{G(\zeta)}{\sqrt{b(\zeta)}}\right).$

(ii) Central approximation. The uniformity of the locality condition H2 implies

(53)    $J^{(0)} \sim G(\zeta) \int_{-\theta_0}^{+\theta_0} e^{-\theta^2 b(\zeta)/2}\, d\theta.$

(iii) Tails completion. A combination of the locality condition H2 and the decay
condition H3, instantiated at $\theta = \theta_0$, shows that $b(\zeta)\theta_0^2 \to +\infty$ as $n \to +\infty$. There
results that tails can be completed back, and

(54)    $\int_{-\theta_0}^{+\theta_0} e^{-b(\zeta)\theta^2/2}\, d\theta = \frac{1}{\sqrt{b(\zeta)}} \int_{-\theta_0\sqrt{b(\zeta)}}^{+\theta_0\sqrt{b(\zeta)}} e^{-t^2/2}\, dt \sim \sqrt{\frac{2\pi}{b(\zeta)}}.$

From (52), (53), and (54) (or equivalently via an application of Theorem VIII.3),
the conclusion of the theorem follows. ∎

The usual comments regarding the choice of the function $\theta_0(r)$ apply. Considering
the expansion (46), we must have $\alpha_2(r)\theta_0^2 \to \infty$ and $\alpha_3(r)\theta_0^3 \to 0$. Thus, in order
to succeed, the method necessitates a priori $\alpha_3(r)^2 / \alpha_2(r)^3 \to 0$. Then, $\theta_0$ should be
taken according to the saddle-point dimensioning heuristic, which can be figuratively
summarized as⁶

(55)    $\frac{1}{\alpha_2(r)^{1/2}} \ll \theta_0 \ll \frac{1}{\alpha_3(r)^{1/3}},$

a possible choice being the geometric mean of the two bounds, $\theta_0(r) = \alpha_2(r)^{-1/4} \alpha_3(r)^{-1/6}$.


The original proof by Hayman [325] contains in addition a general result that
describes the shape of the individual terms $g_n r^n$ in the Taylor expansion of G(r) as r
gets closer to its limit value $\rho$: these appear to exhibit a bell-shaped profile. Precisely,
for G with non-negative coefficients, define a family of discrete random variables X(r)
indexed by $r \in (0, \rho)$ as follows:

    $P(X(r) = n) = \frac{g_n r^n}{G(r)}.$

The model in which a random F structure with GF G(z) is drawn with its size being
the random value X(r) is known as a Boltzmann model. Then:

⁶ We occasionally write $A \ll B$, equivalently, $B \gg A$, if A = o(B).
Figure VIII.7. The families of Boltzmann distributions associated with involutions,
$G(z) = e^{z+z^2/2}$ with r = 4..8, and set partitions, $G(z) = e^{e^z-1}$ with r = 2..3,
obey an approximate Gaussian profile.


Proposition VIII.5. The Boltzmann probabilities associated to an admissible function
G(z) satisfy, as $r \to \rho^-$, a “local” Gaussian estimate; namely,

(56)    $\frac{g_n r^n}{G(r)} = \frac{1}{\sqrt{2\pi b(r)}} \left[ \exp\left(-\frac{(a(r)-n)^2}{2b(r)}\right) + \epsilon_n \right],$

where the error term satisfies $\epsilon_n = o(1)$ as $r \to \rho$ uniformly with respect to integers
n; that is, $\lim_{r\to\rho} \sup_n |\epsilon_n| = 0$.

The proof is entirely similar to that of Theorem VIII.4; see Note VIII.9 and Fig- 
ure VIII.7 for a suggestive illustration. 
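The uniform error term in (56) can be observed numerically in the simplest admissible case, $G(z) = e^z$, where $a(r) = b(r) = r$ and the Boltzmann distribution is the Poisson law of rate r. The sketch below is an illustration only; the function name `sup_epsilon` and the truncation of the range of n are choices made for this example.

```python
import math

def sup_epsilon(r):
    """Largest |eps_n| in (56) over n >= 0 for G(z) = e^z, where a(r) = b(r) = r.

    The range of n is truncated at 4r + 50; beyond that both the Poisson
    probability and the Gaussian term are negligible.
    """
    a = b = r
    worst = 0.0
    for n in range(0, int(4 * r) + 50):
        log_p = -r + n * math.log(r) - math.lgamma(n + 1)   # log of g_n r^n / G(r)
        gauss = math.exp(-(a - n) ** 2 / (2.0 * b))
        eps = math.sqrt(2.0 * math.pi * b) * math.exp(log_p) - gauss
        worst = max(worst, abs(eps))
    return worst

for r in (10.0, 100.0, 1000.0):
    print(r, sup_epsilon(r))
```

The printed values shrink as r grows, in line with the statement that $\sup_n |\epsilon_n| \to 0$.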


▷ VIII.9. Admissibility and Boltzmann models. The Boltzmann distribution is accessible from

    $g_n r^n = \frac{1}{2\pi} \int_{-\theta_0}^{2\pi-\theta_0} G(re^{i\theta})\, e^{-in\theta}\, d\theta.$

The estimation of this integral is once more based on a fundamental split

    $g_n r^n = J^{(0)} + J^{(1)},$    where  $J^{(0)} = \frac{1}{2\pi} \int_{-\theta_0}^{+\theta_0},$    $J^{(1)} = \frac{1}{2\pi} \int_{+\theta_0}^{2\pi-\theta_0},$

and $\theta_0 = \theta_0(r)$ is as specified by the admissibility definition. Only the central approximation
and tails completion deserve adjustments. The “locality” condition H2 gives, uniformly in n,

(57)    $J^{(0)} = \frac{G(r)}{2\pi} \int_{-\theta_0}^{+\theta_0} e^{i(a(r)-n)\theta - \frac{1}{2} b(r)\theta^2} (1 + o(1))\, d\theta$
        $\phantom{J^{(0)}} = \frac{G(r)}{2\pi} \left[ \int_{-\theta_0}^{+\theta_0} e^{i(a(r)-n)\theta - \frac{1}{2} b(r)\theta^2}\, d\theta + o\left( \int_{-\infty}^{+\infty} e^{-\frac{1}{2} b(r)\theta^2}\, d\theta \right) \right],$

and setting $(a(r) - n)(2/b(r))^{1/2} = c$, we obtain

(58)    $J^{(0)} = \frac{G(r)}{\pi\sqrt{2b(r)}} \left[ \int_{-\theta_0\sqrt{b(r)/2}}^{+\theta_0\sqrt{b(r)/2}} e^{-t^2+ict}\, dt + o(1) \right].$

The integral in (58) can then be routinely extended to a complete Gaussian integral, introducing
only o(1) error terms,

(59)    $J^{(0)} = \frac{G(r)}{\pi\sqrt{2b(r)}} \left[ \int_{-\infty}^{+\infty} e^{-t^2+ict}\, dt + o(1) \right].$

Finally, the Gaussian integral evaluates to $\sqrt{\pi}\, e^{-c^2/4}$, as is seen by completing the square and
shifting vertically the integration line. ◁


▷ VIII.10. Hayman's original. The condition H1 of Theorem VIII.4 can be replaced by

    H1'. [Capture condition] $\lim_{r\to\rho} b(r) = +\infty$.

That is, $a(r) \to +\infty$ is a consequence of H1', H2, and H3. (See [325, §5].) ◁


▷ VIII.11. Non-admissible functions. Singularity analysis and H-admissibility conditions are
in a sense complementary. Indeed, the function $G(z) = (1-z)^{-1}$ fails to be admissible
as the asymptotic form that Theorem VIII.4 would imply is the erroneous

    $[z^n] \frac{1}{1-z} \overset{!!}{\sim} \frac{e}{\sqrt{2\pi}},$

corresponding to a saddle-point near $1 - n^{-1}$. The explanation of the discrepancy is as follows:
Expansion (46) has $\alpha_\nu(r)$ of the order of $(1-r)^{-\nu}$, so that the locality condition and the decay
condition cannot be simultaneously satisfied.

Singularity analysis salvages the situation by using a larger contour and by normalizing to
a global Hankel Gamma integral instead of a more “local” Gaussian integral. This is also in
accordance with the fact that the saddle-point formula gives, in the case of $[z^n](1-z)^{-1}$, an
estimate which is within a constant factor of the true value 1. (More generally, functions of the
form $(1-z)^{-\beta}$ are typical instances with too slow a growth to be admissible.) ◁
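The erroneous constant can be reproduced by applying formula (50) mechanically to $G(z) = 1/(1-z)$, for which $a(r) = r/(1-r)$ and $b(r) = r/(1-r)^2$, so that the saddle-point equation has the explicit solution $\zeta = n/(n+1)$. The function name below is hypothetical and the computation is a sketch, not part of the theory.

```python
import math

def naive_saddle_estimate(n):
    """Formula (50) applied (illegitimately) to G(z) = 1/(1-z)."""
    zeta = n / (n + 1.0)                # solves a(r) = r/(1-r) = n
    log_G = -math.log1p(-zeta)          # log G(zeta) = -log(1 - zeta)
    b = zeta / (1.0 - zeta) ** 2        # b(r) = r/(1-r)^2
    return math.exp(log_G - n * math.log(zeta) - 0.5 * math.log(2 * math.pi * b))

print(naive_saddle_estimate(10**6), math.e / math.sqrt(2 * math.pi))
```

The two printed numbers agree closely, while the true coefficient is exactly 1: the discrepancy is the constant factor mentioned in the note.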


Closure properties. An important aspect of Hayman’s work is that it leads to 
general theorems, which guarantee that large classes of functions are admissible. 


Theorem VIII.5 (Closure of H-admissible functions). Let G(z) and H(z) be admissible
functions and let P(z) be a polynomial with real coefficients. Then:

(i) The product G(z)H(z) and the exponential $e^{G(z)}$ are admissible functions.

(ii) The sum G(z) + P(z) is admissible. If the leading coefficient of P(z) is
positive then G(z)P(z) and P(G(z)) are admissible.

(iii) If the Taylor coefficients of $e^{P(z)}$ are eventually positive, then $e^{P(z)}$ is
admissible.


Proof. (Sketch) The easy proofs essentially reduce to making an inspired guess for
the choice of the $\theta_0$ function, which may be guided by Equation (55) in the usual
way, and then routinely checking the conditions of the admissibility definition. For
instance, in the case of the exponential, $K(z) = e^{G(z)}$, the conditions H1, H2, H3 of
Definition VIII.1 are satisfied if one takes $\theta_0(r) = (G(r))^{-2/5}$. We refer to Hayman's
original paper [325] for details. ∎


Exponentials of polynomials. The closure theorem also implies as a very special
case that any GF of the form $e^{P(z)}$ with P(z) a polynomial with positive coefficients
can be subjected to saddle-point analysis, a fact first noted by Moser and Wyman [449,
450].


Corollary VIII.2 (Exponentials of polynomials). Let $P(z) = \sum_{j=1}^{m} a_j z^j$ have
non-negative coefficients and be aperiodic in the sense that $\gcd\{j \mid a_j \ne 0\} = 1$. Let
$f(z) = e^{P(z)}$. Then, one has

    $f_n \equiv [z^n]f(z) \sim \frac{1}{\sqrt{2\pi\lambda}} \frac{e^{P(r)}}{r^n},$    where  $\lambda = \left(r \frac{d}{dr}\right)^2 P(r),$

and r is a function of n given implicitly by $r \frac{d}{dr} P(r) = n$.
The computations are in this case purely mechanical, since they only involve the 
asymptotic expansion (with respect to n) of an algebraic equation. 
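The mechanical character of Corollary VIII.2 can be illustrated as follows. The sketch computes exact coefficients of $e^{P(z)}$ from the recurrence $n f_n = \sum_j j a_j f_{n-j}$ (a consequence of $f' = P'f$), solves $rP'(r) = n$ by bisection, and compares with the saddle-point estimate. Function names and the test case (involutions, n = 200) are assumptions of this example.

```python
from fractions import Fraction
import math

def exact_coefficients(a, n_max):
    """Coefficients f_0..f_{n_max} of exp(P(z)), with P(z) = sum_j a[j] z^j and a[0] = 0.

    Differentiating f = exp(P) gives f' = P' f, that is, n f_n = sum_{j>=1} j a_j f_{n-j}.
    """
    f = [Fraction(1)]
    for n in range(1, n_max + 1):
        acc = sum(j * a[j] * f[n - j] for j in range(1, min(n, len(a) - 1) + 1))
        f.append(Fraction(acc) / n)
    return f

def log_estimate(a, n):
    """Logarithm of e^{P(r)} / (r^n sqrt(2 pi lambda)), where r P'(r) = n."""
    coeffs = [float(c) for c in a]
    P = lambda r: sum(c * r ** j for j, c in enumerate(coeffs))
    rP1 = lambda r: sum(j * c * r ** j for j, c in enumerate(coeffs))       # r d/dr P
    lam = lambda r: sum(j * j * c * r ** j for j, c in enumerate(coeffs))   # (r d/dr)^2 P
    lo, hi = 0.0, 1.0
    while rP1(hi) < n:              # bracket the root
        hi *= 2.0
    for _ in range(200):            # bisection; r P'(r) increases for r > 0
        mid = 0.5 * (lo + hi)
        if rP1(mid) < n:
            lo = mid
        else:
            hi = mid
    r = 0.5 * (lo + hi)
    return P(r) - n * math.log(r) - 0.5 * math.log(2 * math.pi * lam(r))

involutions = [Fraction(0), Fraction(1), Fraction(1, 2)]   # P(z) = z + z^2/2

n = 200
f = exact_coefficients(involutions, n)
exact_log = math.log(f[n].numerator) - math.log(f[n].denominator)
print(math.exp(log_estimate(involutions, n) - exact_log))   # ratio estimate / exact
```

Any other polynomial with non-negative coefficients satisfying the aperiodicity condition can be passed in the same way, as a list of coefficients indexed by degree.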

Granted the basic admissibility theorem and closure properties, many functions
are immediately seen to be admissible, including

    $e^z$,    $e^{e^z-1}$,    $e^{z+z^2/2}$,

which have previously served as lead examples for illustrating the saddle-point method.
Corollary VIII.2 also covers involutions, permutations of a fixed order in the symmetric
group, permutations with cycles of bounded length, as well as set partitions with
bounded block sizes: see Note VIII.12 below. More generally, Corollary VIII.2 applies
to any labelled set construction, F = SET(G), when the sizes of G-components
are restricted to a finite set, in which case one has

    $F^{[m]} = \mathrm{SET}\left(\cup_{j=1}^{m} G_j\right)  \implies  F^{[m]}(z) = \exp\left( \sum_{j=1}^{m} G_j \frac{z^j}{j!} \right).$

This covers all sorts of graphs (plain or functional) whose connected components are
of bounded size.


▷ VIII.12. Applications of “exponentials of polynomials”. Corollary VIII.2 applies to the
following combinatorial situations:

    Permutations of order p ($\sigma^p = 1$)           $f(z) = \exp\left( \sum_{j \mid p} \frac{z^j}{j} \right)$

    Permutations with longest cycle $\le p$          $f(z) = \exp\left( \sum_{j=1}^{p} \frac{z^j}{j} \right)$

    Partitions of sets with largest block $\le p$    $f(z) = \exp\left( \sum_{j=1}^{p} \frac{z^j}{j!} \right)$

For instance, the number of solutions of $\sigma^p = 1$ in the symmetric group is asymptotic to

    $\left(\frac{n}{e}\right)^{n(1-1/p)} p^{-1/2} \exp(n^{1/p}),$

for any fixed prime $p \ge 3$ (Moser and Wyman [449, 450]). ◁
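The Moser and Wyman estimate can be compared with exact counts. For prime p the EGF is $\exp(z + z^p/p)$, and $f' = (1 + z^{p-1})f$ yields the recurrence used below. This is a sketch under those assumptions; the function names and the sample value n = 300 are chosen for illustration, and convergence is slow (the relative error decays roughly like a power of $n^{-1/p}$).

```python
import math

def count_solutions(p, n_max):
    """t_n = #{sigma in S_n : sigma^p = 1} for prime p, from the EGF exp(z + z^p/p).

    The recurrence t_n = t_{n-1} + (n-1)(n-2)...(n-p+1) t_{n-p} follows from
    f'(z) = (1 + z^{p-1}) f(z).
    """
    t = [1] * (n_max + 1)
    for n in range(1, n_max + 1):
        t[n] = t[n - 1]
        if n >= p:
            falling = 1
            for k in range(1, p):
                falling *= (n - k)
            t[n] += falling * t[n - p]
    return t

def log_moser_wyman(p, n):
    """Logarithm of (n/e)^{n(1-1/p)} p^{-1/2} exp(n^{1/p})."""
    return n * (1 - 1 / p) * (math.log(n) - 1) - 0.5 * math.log(p) + n ** (1 / p)

p, n = 3, 300
t = count_solutions(p, n)
print(math.exp(math.log(t[n]) - log_moser_wyman(p, n)))   # ratio exact / asymptotic
```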


Complete asymptotic expansions. Harris and Schoenfeld have introduced in [323]
a technical condition of admissibility that is stronger than Hayman admissibility and
is called HS-admissibility. Under such HS-admissibility, a complete asymptotic
expansion can be obtained. We omit the definition here due to its technical character but
refer instead to the original paper [323] and to Odlyzko's survey [461]. Odlyzko and
Richmond [462] later showed that, if g(z) is H-admissible, then $f(z) = e^{g(z)}$ is
HS-admissible. Thus, taking H-admissibility to mean at least exponential growth, full
asymptotic expansions are to be systematically expected at double exponential growth
and beyond. The principles of developing full asymptotic expansions are essentially
the same as the ones explained on p. 557; only the discussion of the asymptotic scales
involved becomes a bit intricate at this level of generality.
VIII. 5.2. Higher-level structures and admissibility. The concept of admissibility
and its surrounding properties (Theorems VIII.4 and VIII.5, Corollary VIII.2)
afford a neat discussion of which combinatorial classes should lead to counting
sequences that are amenable to the saddle-point method. For simplicity, we restrict
ourselves here to the labelled universe.

Start from the first-level structures, namely

    SEQ(Z),    CYC(Z),    SET(Z),

corresponding, respectively, to permutations, circular graphs, and urns, with EGFs

    $\frac{1}{1-z}$,    $\log\frac{1}{1-z}$,    $e^z$.

The first two are of singularity analysis class; the last is, as we saw, within the reach
of the saddle-point method and is H-admissible.

Next consider second-level structures defined by arbitrary composition of two
constructions taken among SEQ, CYC, SET; see Subsection II. 4.2, p. 124 for a
preliminary discussion. (In the case of the internal construction, it is understood that, for
definiteness, the number of components is constrained to be $\ge 1$.) There are three
structures whose external construction is of the sequence type, namely,

    SEQ ∘ SEQ,    SEQ ∘ CYC,    SEQ ∘ SET,

corresponding, respectively, to labelled compositions, alignments, and surjections. All
three have a dominant singularity that is a pole; hence they are amenable to meromorphic
coefficient asymptotics (Chapters IV and V), or, with weaker remainder estimates,
to singularity analysis (Chapters VI and VII).

Similarly there are three structures whose external construction is of the cycle
type, namely,

    CYC ∘ SEQ,    CYC ∘ CYC,    CYC ∘ SET,

corresponding to cyclic versions of the previous ones. In that case, the EGFs have
a logarithmic singularity; hence they are amenable to singularity analysis, or, after
differentiation, to meromorphic coefficient asymptotics again.

The case of an external set construction is of interest. It gives rise to

    SET ∘ SEQ,    SET ∘ CYC,    SET ∘ SET,

corresponding, respectively, to fragmented permutations, the class of all permutations,
and set partitions. The composition SET ∘ CYC appears to be special, because of the
general isomorphism, valid for any class C,

    SET(CYC(C)) ≅ SEQ(C),

corresponding to the unicity of the decomposition of a permutation of C-objects into
cycles. Accordingly, for generating functions, an exponential singularity “simplifies”,
when combined with a logarithmic singularity, giving rise to an algebraic (here polar)
singularity. The remaining two cases, namely, fragmented permutations and set
partitions, characteristically come under the saddle-point method and admissibility, as we
have seen already.
Closure properties then make it possible to consider structures defined by an arbitrary
nesting of the constructions in {SEQ, CYC, SET}. For instance, “superpartitions”
defined by

    $S = \mathrm{SET}(\mathrm{SET}_{\ge 1}(\mathrm{SET}_{\ge 1}(Z)))  \implies  S(z) = e^{e^{e^z-1}-1},$

are third-level structures. They can be subjected to admissibility theory and saddle-point
estimates apply a priori. Notes VIII.14 and VIII.15 further examine such third-level
structures.

▷ VIII.13. Idempotent mappings. Consider functions from a finite set to itself (“mappings” or
“functional graphs” in the terminology of Chapter II) that are idempotent, i.e., $\phi \circ \phi = \phi$. The
EGF is $I(z) = \exp(z e^z)$ since cycles are constrained to have length 1 exactly. The function I(z)
is admissible and

    $I_n \sim \frac{n!}{\sqrt{2\pi n \zeta}}\, \zeta^{-n} e^{(n+1)/(\zeta+1)},$

where $\zeta$ is the positive solution of $\zeta(\zeta+1)e^{\zeta} = n + 1$. This example is discussed by Harris
and Schoenfeld in [323]. ◁


▷ VIII.14. The number of societies. A society on n distinguished individuals is defined by
Sloane and Wieder [545] as follows: first partition the n individuals into non-empty subsets
and then form an ordered set partition [preferential arrangement] of each subset. The class of
societies is thus a third-level (labelled) structure, with specification and EGF

    $S = \mathrm{SET}(\mathrm{SEQ}_{\ge 1}(\mathrm{SET}_{\ge 1}(Z)))  \implies  S(z) = \exp\left( \frac{1}{2-e^z} - 1 \right).$

The counting sequence starts as 1, 1, 4, 23, 173, 1602 (EIS A075729); asymptotically

    $\frac{S_n}{n!} \sim C\, \frac{e^{\sqrt{2n/\log 2}}}{n^{3/4} (\log 2)^{n+1/4}},$    $C := \frac{1}{2^{5/4}\sqrt{\pi}} \exp\left( \frac{1}{4\log 2} - \frac{3}{4} \right).$

(The singularity is of the type “exponential-of-pole” at z = log 2.) ◁
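The initial values of the society numbers can be recomputed directly from the EGF. The sketch below first generates the numbers of ordered set partitions (EGF $1/(2-e^z)$) and then exponentiates via $S' = R'S$ with $R(z) = 1/(2-e^z) - 1$; the function names are hypothetical.

```python
from math import comb

def fubini_numbers(n_max):
    """r_n = number of ordered set partitions (preferential arrangements) of n items.

    EGF 1/(2 - e^z); recurrence r_n = sum_{k=1}^{n} C(n,k) r_{n-k}, with r_0 = 1.
    """
    r = [1]
    for n in range(1, n_max + 1):
        r.append(sum(comb(n, k) * r[n - k] for k in range(1, n + 1)))
    return r

def societies(n_max):
    """S_n = n! [z^n] exp(1/(2 - e^z) - 1) for n = 0..n_max.

    With R(z) = 1/(2 - e^z) - 1, the identity S' = R' S gives
    S_{n+1} = sum_{k=0}^{n} C(n,k) r_{k+1} S_{n-k}.
    """
    r = fubini_numbers(n_max)
    s = [1]
    for n in range(n_max):
        s.append(sum(comb(n, k) * r[k + 1] * s[n - k] for k in range(n + 1)))
    return s

print(societies(5))
```

Only integer arithmetic is involved, so the values are exact.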


▷ VIII.15. Third-level classes. Consider labelled classes defined from atoms (Z) by three
nested constructions, each either a sequence or a set. All cases can be analysed, either by
saddle-point and admissibility or by singularity analysis. Here is a table recapitulating structures,
together with their EGF and radius of convergence ($\rho$):

    Saddle-point:
        SET(SET≥1(SET≥1(Z)))      $e^{e^{e^z-1}-1}$                          $\rho = \infty$
        SET(SET≥1(SEQ≥1(Z)))      $e^{e^{z/(1-z)}-1}$                        $\rho = 1$
        SET(SEQ≥1(SET≥1(Z)))      $\exp\left(\frac{e^z-1}{2-e^z}\right)$      $\rho = \log 2$
        SET(SEQ≥1(SEQ≥1(Z)))      $e^{z/(1-2z)}$                             $\rho = \frac{1}{2}$

    Singularity analysis:
        SEQ(SET≥1(SET≥1(Z)))      $\frac{1}{2-e^{e^z-1}}$                    $\rho = \log\log(2e)$
        SEQ(SET≥1(SEQ≥1(Z)))      $\frac{1}{2-e^{z/(1-z)}}$                  $\rho = \frac{\log 2}{1+\log 2}$
        SEQ(SEQ≥1(SET≥1(Z)))      $\frac{2-e^z}{3-2e^z}$                     $\rho = \log\frac{3}{2}$
        SEQ(SEQ≥1(SEQ≥1(Z)))      $\frac{1-2z}{1-3z}$                        $\rho = \frac{1}{3}$

The outermost construction dictates the analytic type and precise asymptotic equivalents can be
developed in all cases. ◁
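For the four sequence-rooted classes, the radius of convergence is the smallest positive zero of the EGF's denominator. The sketch below checks this numerically; the denominators are those obtained by composing $z/(1-z)$ and $e^z - 1$ with $1/(1-w)$, and the dictionary layout and helper name are assumptions of this example.

```python
import math

# name -> (denominator of the EGF, claimed radius of convergence)
cases = {
    "SEQ(SET>=1(SET>=1(Z)))": (lambda z: 2 - math.exp(math.exp(z) - 1),
                               math.log(math.log(2 * math.e))),
    "SEQ(SET>=1(SEQ>=1(Z)))": (lambda z: 2 - math.exp(z / (1 - z)),
                               math.log(2) / (1 + math.log(2))),
    "SEQ(SEQ>=1(SET>=1(Z)))": (lambda z: 3 - 2 * math.exp(z),
                               math.log(1.5)),
    "SEQ(SEQ>=1(SEQ>=1(Z)))": (lambda z: 1 - 3 * z,
                               1 / 3),
}

def check(name, grid=100):
    """Return (value of the denominator at rho, whether it stays positive on [0, rho))."""
    denom, rho = cases[name]
    positive_before = all(denom(rho * k / grid) > 0 for k in range(grid))
    return denom(rho), positive_before

for name in cases:
    print(name, check(name))
```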
▷ VIII.16. A Multiple Choice Questionnaire. Classify all the 27 third-level structures built
out of {SEQ, CYC, SET}, according to whether they are of type SA (singularity analysis) or SP
(saddle-point). ◁


▷ VIII.17. A meta-MCQ. Among the $3^n$ specifications of level n, what is the asymptotic
proportion of those that are of type SP? ◁


VIII. 5.3. Analytic depoissonization. We conclude this section on methodology
with a sketch of an approach to the analysis of exponential generating functions,
which has been termed analytic depoissonization by its proponents, Jacquet and
Szpankowski [346, 564]. This approach, which is based on the saddle-point method, has
affinities with admissibility theory and it plays a rôle in the investigation of several
important models of discrete mathematics.

The Poisson generating function of a sequence $(a_n)$ is defined as

    $\alpha(z) = \sum_{n\ge 0} a_n e^{-z} \frac{z^n}{n!}.$

It is thus a simple variant of the EGF (multiply by $e^{-z}$) and, when z assumes a
non-negative real value $\lambda$, it can be viewed as a sum of the $a_n$ weighted by the Poisson
probabilities $\{e^{-\lambda}\lambda^n/n!\}$. Since the Poisson distribution is concentrated around its
mean value $\lambda$, it is reasonable to expect an approximation

(60)    $\alpha(\lambda) \sim a_{\lfloor\lambda\rfloor}$    $(\lambda \to +\infty)$

to be valid, provided $a_n$, assumed to be known, varies sufficiently “regularly”. A
statement granting us the correctness of (60), based on a priori knowledge of the $a_n$,
is an Abelian theorem, in the usual sense of analysis (see Section VI. 11, p. 433, and
e.g., [69, §1.7]); it is easily established using the Laplace method for sums (p. 755),
upon appealing to a Gaussian approximation of Poisson laws of large rate $\lambda$ (Note IX.19,
p. 643).

What is of interest here is the converse (Tauberian) problem: we seek ways of
translating information on the Poisson generating function $\alpha(z)$ into an asymptotic
expansion of the coefficients $(a_n)$. Beyond being fully in the spirit of the book
(especially, Chapters VI and VII), this situation is of interest, since it is encountered in
many probabilistic contexts where a Poisson model intervenes. In this subsection,
we stand on the shoulders of Jacquet and Szpankowski [346, 564], who developed a
whole theory.

A sector $S_\phi$, with $\phi \in \mathbb{R}$, is defined to be $S_\phi = \{z : |\arg(z)| \le \phi\}$. A function
f(z) is said to be small, away from the positive real axis, if, for some A with 0 < A < 1 and
$\phi \in (0, \pi/2)$, one has

    $|e^z f(z)| = O\left(e^{A|z|}\right),$    as $|z| \to \infty$, $z \notin S_\phi$.

(The restriction A < 1 is what makes the contribution from outside the sector exponentially
negligible on the circle |z| = n.) We have [564, Th. 10.6]:


Theorem VIII.6 (Analytic depoissonization). Let the Poisson generating function $\alpha(z)$
be small, away from the positive real axis, with sector $S_\phi$. Then one has the following
correspondence between properties of the individual terms in the expansion of $\alpha(z)$
within $S_\phi$ and asymptotic terms in the expansion of the coefficient $a_n$:

    $\alpha(z)$                          $a_n$
    $O(|z|^B |\log z|^C)$       →    $O(n^B (\log n)^C)$
    $z^B$                       →    $n^B \left[ 1 - \frac{B(B-1)}{2n} + \frac{B(B-1)(B-2)(3B-1)}{24n^2} + \cdots \right]$
    $z^B (\log z)^r$            →    $\frac{\partial^r}{\partial B^r} \left( n^B \left[ 1 - \frac{B(B-1)}{2n} + \cdots \right] \right).$

Proof. (Sketch) Given the assumptions, we regard $e^z\alpha(z)$ as a variant of the
exponential function, to which the saddle-point method is known to be applicable: see
the derivation of Example VIII.3 (p. 555), which we closely follow. Accordingly, we
appeal to Cauchy's formula,

    $a_n = \frac{n!}{2i\pi} \oint \alpha(z)\, e^z\, \frac{dz}{z^{n+1}},$

and integrate along the circle |z| = n. The smallness condition on $\alpha(z)$ ensures that
the integral outside of $S_\phi$ is exponentially negligible. Setting $z = ne^{i\theta}$, we see that,
inside $S_\phi$, we can neglect the part corresponding to $|\theta| \ge \theta_0 \equiv n^{-2/5}$, since it is
again exponentially small. Then, for the central part of the contour,

    $a_n^{(0)} := \frac{n!\, e^n n^{-n}}{2\pi} \int_{-\theta_0}^{\theta_0} e^{-n\theta^2/2} \exp\left( n\left[ e^{i\theta} - 1 - i\theta + \frac{\theta^2}{2} \right] \right) \alpha(ne^{i\theta})\, d\theta,$

it suffices to perform the change of variables $t = \theta\sqrt{n}$, make careful use of the
assumed asymptotic approximation of $\alpha(z)$ in each of the three cases, and finally
conclude. ∎
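For integer B the second correspondence of Theorem VIII.6 can be checked exactly, since $\alpha(z) = z^B$ gives $a_n = n!\,[z^n] e^z z^B = n(n-1)\cdots(n-B+1)$. The sketch below compares this falling factorial with the first three terms of the expansion; the function names are hypothetical.

```python
from fractions import Fraction

def falling_factorial(n, B):
    """n (n-1) ... (n-B+1), which equals n! [z^n] e^z z^B for integers 0 <= B <= n."""
    out = 1
    for k in range(B):
        out *= n - k
    return out

def depoissonized_power(n, B):
    """n^B [1 - B(B-1)/(2n) + B(B-1)(B-2)(3B-1)/(24 n^2)], evaluated exactly."""
    n = Fraction(n)
    correction = (1
                  - Fraction(B * (B - 1), 2) / n
                  + Fraction(B * (B - 1) * (B - 2) * (3 * B - 1), 24) / n ** 2)
    return n ** B * correction

for B in (2, 3, 4):
    print(B, falling_factorial(10, B), depoissonized_power(10, B))
```

For B at most 3 the three displayed terms already reproduce $a_n$ exactly; for B = 4 the difference is the next term of the expansion, of order $n^{B-3}$.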


The estimates of Theorem VIII.6 are thus considerable refinements of (60). (To 
some probabilists, it may come as a surprise that one can depoissonize by making 
use of Poisson laws of complex rate!) Analytic depoissonization parallels the philos- 
ophy underlying singularity analysis as well as admissibility theory. Its merit is to 
be well-suited to solving a large number of problems arising in word statistics, the 
analysis of digital trees and distributed algorithms, as well as data compression: see 
Szpankowski’s book [564, Ch. 10] and the fundamental study [346] for applications 
and advanced results. 
▷ VIII.18. The “Jasz” expansion. Jacquet and Szpankowski prove more generally that

    $a_n \sim \alpha(n) + \sum_{k=1}^{\infty} \sum_{i=1}^{k} c_{i,k+i}\, n^i \left. \partial_z^{k+i} \alpha(z) \right|_{z=n},$

where $c_{i,j} = [x^i y^j] \exp(x \log(1+y) - xy)$, under suitable conditions on $\alpha(z)$. ◁
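When $\alpha(z)$ is a polynomial, all derivatives beyond its degree vanish and the expansion above terminates, so it can be tested as an exact identity. The sketch below assumes the indexing $c_{i,j} = [x^i y^j]\exp(x\log(1+y) - xy)$, so that the term in $n^i\,\alpha^{(j)}(n)$ carries the coefficient $c_{i,j}$; function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def jasz_coefficients(i_max, j_max):
    """c[i][j] = [x^i y^j] exp(x (log(1+y) - y)), as exact fractions."""
    # u(y) = log(1+y) - y = sum_{m>=2} (-1)^(m+1) y^m / m, truncated at degree j_max
    u = [Fraction(0)] * (j_max + 1)
    for m in range(2, j_max + 1):
        u[m] = Fraction((-1) ** (m + 1), m)
    c = []
    power = [Fraction(1)] + [Fraction(0)] * j_max      # u(y)^0
    for i in range(i_max + 1):
        c.append([coef / factorial(i) for coef in power])
        nxt = [Fraction(0)] * (j_max + 1)              # next power: multiply by u(y)
        for p, cp in enumerate(power):
            if cp:
                for q in range(2, j_max + 1 - p):
                    nxt[p + q] += cp * u[q]
        power = nxt
    return c

def depoissonize_polynomial(alpha, n):
    """Sum of c_{i,j} n^i alpha^{(j)}(n) for a polynomial alpha given by its coefficient list."""
    d = len(alpha) - 1
    c = jasz_coefficients(d // 2, d)
    derivs = []                 # derivs[j] = j-th derivative of alpha at n
    poly = list(alpha)
    for _ in range(d + 1):
        derivs.append(sum(Fraction(a) * n ** k for k, a in enumerate(poly)))
        poly = [k * a for k, a in enumerate(poly)][1:]
    return sum(c[i][j] * n ** i * derivs[j]
               for i in range(d // 2 + 1) for j in range(d + 1))

# alpha(z) = z^4 corresponds to a_n = n (n-1) (n-2) (n-3)
print(depoissonize_polynomial([0, 0, 0, 0, 1], 10))
```

The nonzero coefficients have $j \ge 2i$, which is why the inner index only needs to run up to half the degree.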


▷ VIII.19. The converse “Jasz” expansion. Jacquet and Szpankowski also give an Abelian
result:

    $\alpha(z) \sim g(z) + \sum_{k=1}^{\infty} \sum_{i=1}^{k} d_{i,k+i}\, z^i\, \partial_z^{k+i} g(z),$

where $d_{i,j} = [x^i y^j] \exp(x(e^y - 1) - xy)$, the function g(z) extrapolates $a_n$ (i.e., $a_n = g(n)$)
to $\mathbb{C}$, and suitable smoothness conditions on g are imposed. ◁


VIII. 6. Integer partitions 


We now examine the asymptotic enumeration of integer partitions, where the 
saddle-point method serves as the main asymptotic engine. The corresponding gener- 
ating function enjoys rich properties, and the analysis, which goes back to Hardy and 
Ramanujan in 1917, constitutes, as pointed out in the introduction of this chapter, a 
jewel of classical analysis. 

Integer partitions represent additive decompositions of integers, when the order 
of summands is not taken into account. When all summands are allowed, the specifi- 
cation and ordinary generating function are (Section I. 3, p. 39) 


(61)    $P = \mathrm{MSET}(\mathrm{SEQ}_{\ge 1}(Z))  \implies  P(z) = \prod_{m=1}^{\infty} \frac{1}{1-z^m},$


which, by the exp—log transformation, admits the equivalent form 


(62)    $P(z) = \exp \sum_{m=1}^{\infty} \log(1-z^m)^{-1}$
        $\phantom{P(z)} = \exp\left( \frac{z}{1-z} + \frac{1}{2} \frac{z^2}{1-z^2} + \frac{1}{3} \frac{z^3}{1-z^3} + \cdots \right).$

From either of these two forms, it can be seen that the unit circle is a natural boundary,
beyond which the function cannot be continued. The second form, which involves
the quantity exp(z/(1 - z)), is reminiscent of the EGF of fragmented permutations,
examined in Example VIII.7, p. 562, to which the saddle-point method could be
successfully applied.

In what follows, we show (Example VIII.8 below) that the saddle-point method is 
applicable, although the analysis of P(z) near the unit circle is delicate (and pregnant 
with deep properties). The accompanying notes point to similar methods being appli- 
cable to a variety of similar-looking generating functions, including those relative to 
partitions into primes, squares, and distinct summands, as well as plane partitions: see 
Figure VIII.8 for a summary of some of the asymptotic results known. 


Example VIII.8. Integer partitions. We are dealing here with a famous chapter of both
asymptotic combinatorics and additive number theory. A problem similar to that of asymptotically
enumerating partitions was first raised by Ramanujan in a letter to Hardy in 1913, and
subsequently developed in a famous joint work of Hardy and Ramanujan (see the account in Hardy's
Lectures [321]). The Hardy-Ramanujan expansion was later perfected by Rademacher [22]
who, in a sense, gave an “exact” formula for the partition numbers $P_n$.

A complete derivation with all details would consume more space than we can devote to
this question. We outline here the proof strategy in such a way that, hopefully, the reader can
supply the missing details by herself. (The cited references provide a complete treatment.)

As before, we start with simple saddle-point bounds, already briefly discussed on p. 248.
Let $P_n$ denote the number of integer partitions of n, with OGF as stated in (61). A form
    Summands                        specification                      asymptotics                                                  reference
    all, $\mathbb{Z}_{\ge 1}$             MSET(SEQ≥1(Z))                     $\frac{1}{4n\sqrt{3}}\, e^{\pi\sqrt{2n/3}}$                   Ex. VIII.8, p. 574
    all distinct, $\mathbb{Z}_{\ge 1}$    PSET(SEQ≥1(Z))                     $\frac{1}{4 \cdot 3^{1/4} n^{3/4}}\, e^{\pi\sqrt{n/3}}$        Note VIII.24, p. 579
    squares, 1, 4, 9, 16, ...                                          $C n^{-7/6} e^{K n^{1/3}}$                                   Note VIII.24, p. 579
    primes, 2, 3, 5, 7, ...                                            $\log P_n^{(\Pi)} \sim c \sqrt{\frac{n}{\log n}}$             Note VIII.26, p. 580
    powers of two, 1, 2, 4, ...                                        $\log M_{2n} \sim \frac{(\log n)^2}{2\log 2}$                 Note VIII.27, p. 581
    plane                           $\prod_{m\ge 1} (1-z^m)^{-m}$      $c\, n^{-25/36} e^{K n^{2/3}}$                               Note VIII.25, p. 580


Figure VIII.8. Asymptotic enumeration of various types of partitions.


amenable to bounds is derived from the exp-log reorganization (62), which yields

    $P(z) = \exp\left( \frac{1}{1-z} \left( \frac{z}{1} + \frac{z^2}{2(1+z)} + \frac{z^3}{3(1+z+z^2)} + \cdots \right) \right).$

The denominator of the general term in the exponential satisfies, for $x \in (0, 1)$, the inequalities
$m x^{m-1} < (1 + x + \cdots + x^{m-1}) < m$, so that

(63)    $\frac{1}{1-x} \sum_{m\ge 1} \frac{x}{m^2} > \log P(x) > \frac{1}{1-x} \sum_{m\ge 1} \frac{x^m}{m^2}.$

This proves for real $x \to 1^-$ that

(64)    $P(x) = \exp\left( \frac{\pi^2}{6(1-x)} (1 + o(1)) \right),$

given the elementary identity $\sum_{m\ge 1} m^{-2} = \pi^2/6$. The singularity type at z = 1 resembles that
of fragmented permutations (p. 562), and at least the growth along the real axis is similar. An
approximate saddle-point is then

(65)    $\zeta(n) = 1 - \frac{\pi}{\sqrt{6n}},$

which gives a saddle-point bound

(66)    $P_n \le \exp\left( \pi\sqrt{2n/3}\, (1 + o(1)) \right).$
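The growth rate in (66) is easy to compare with exact values. The sketch below computes partition numbers by expanding the product (61) one factor at a time and prints $\log P_n$ next to $\pi\sqrt{2n/3}$; the function name is hypothetical.

```python
import math

def partition_numbers(n_max):
    """P_0..P_{n_max}, obtained by multiplying in the factors 1/(1 - z^m) one at a time."""
    p = [1] + [0] * n_max
    for m in range(1, n_max + 1):
        for n in range(m, n_max + 1):
            p[n] += p[n - m]
    return p

p = partition_numbers(300)
for n in (10, 100, 300):
    print(n, p[n], math.log(p[n]), math.pi * math.sqrt(2 * n / 3))
```

The logarithm of $P_n$ stays below $\pi\sqrt{2n/3}$, and the gap grows only logarithmically, consistent with the polynomial factor found later in the analysis.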


Proceeding further involves transforming the saddle-point bounds into a complete
saddle-point analysis. Based on previous experience, we shall integrate along a circle of radius
$r = \zeta(n)$. To do so, two ingredients are needed: (i) an approximation in the central range; (ii) bounds
establishing that the function P(z) is small away from the central range so that tails can be first
neglected, then completed back. Assuming the expansion (62) to lift to an area of the complex
plane near the real axis, the range of the saddle-point should be analogous to that already found
for exp(z/(1 - z)), so that $\theta_0 = n^{-7/10}$ will be adopted. Accordingly, we choose to integrate
along a circle of radius $r = \zeta(n)$ given by (65) and define the central region by $\theta_0 = n^{-7/10}$.
Under these conditions, the central region is seen under an angle that is $O(n^{-1/5})$ from the
point z = 1.

(i) Central approximation. This requires a refinement of (64) up to o(1) terms as well as an
argument establishing a lifting to a region near the real axis. We set $z = e^{-t}$ and start with
t > 0. The function

    $L(t) := \log P(e^{-t}) = \sum_{m\ge 1} \frac{e^{-mt}}{m(1-e^{-mt})}$

is a harmonic sum which is amenable to Mellin transform techniques (as described in
Appendix B.7: Mellin transforms, p. 762; see also p. 248). The base function is $e^{-t}/(1-e^{-t})$, the
amplitudes are the coefficients 1/m and the frequencies are the quantities m figuring in the
exponents. The Mellin transform of the base function, as given in Appendix B (p. 763), is $\Gamma(s)\zeta(s)$.
The Dirichlet series associated to the amplitude frequency pairs is $\sum_m m^{-1} m^{-s} = \zeta(s+1)$, so
that

    $L^*(s) = \zeta(s)\zeta(s+1)\Gamma(s).$

Thus L(t) is amenable to Mellin asymptotics and one finds

(67)    $L(t) = \frac{\pi^2}{6t} + \frac{1}{2}\log t - \log\sqrt{2\pi} - \frac{1}{24} t + O(t^{2006}),$    $t \to 0^+,$

from the poles of $L^*(s)$ at s = 1, 0, -1. This corresponds to an improved form of (64):

(68)    $\log P(z) = \frac{\pi^2}{6(1-z)} + \frac{1}{2}\log(1-z) - \frac{\pi^2}{12} - \log\sqrt{2\pi} + O(1-z).$
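The quality of the four explicit terms in (67) can be checked numerically by summing the harmonic sum directly. The sketch below is an illustration; the truncation rule and function names are choices made for this example.

```python
import math

def L_exact(t, terms=20000):
    """L(t) = log P(e^{-t}) = sum_{m>=1} e^{-mt} / (m (1 - e^{-mt})), summed directly."""
    total = 0.0
    for m in range(1, terms + 1):
        x = m * t
        if x > 745.0:            # e^{-x} underflows to 0 beyond this point
            break
        total += math.exp(-x) / (m * -math.expm1(-x))
    return total

def L_asymptotic(t):
    """The four explicit terms of (67)."""
    return (math.pi ** 2 / (6 * t) + 0.5 * math.log(t)
            - 0.5 * math.log(2 * math.pi) - t / 24)

for t in (0.05, 0.1, 0.5):
    print(t, L_exact(t), L_asymptotic(t))
```

The agreement is essentially to machine precision even for t as large as 0.5, which reflects the fact that the remainder in (67) is smaller than any power of t.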


At this stage, we make a crucial observation: The precise estimate (67) extends when t lies
in any sector symmetric about the real axis, situated in the half-plane $\Re(t) > 0$, and with an
opening angle of the form $\pi - \delta$ for an arbitrary $\delta > 0$. This is derived from the fact that
the Mellin inversion integral and the companion residue calculations giving rise to (67) extend
to the complex realm as long as $|\arg(t)| \le \frac{\pi}{2} - \frac{1}{2}\delta$. (See Appendix B.7: Mellin transforms,
p. 762 or the article [234].) Thus, the expansion (68) holds throughout the central region given
our choice of the angle $\theta_0$. The analysis in the central region is then practically isomorphic to
that of exp(z/(1 - z)) in the previous example, and it presents no special difficulty.


(ii) Bounds in the non-central region. This is here a non-trivial task since half of the
factors entering the product form (61) of P(z) are infinite at z = -1, one third are infinite at
$z = e^{\pm 2i\pi/3}$, and so on. Accordingly, the landscape of |P(z)| along a circle of radius r that
tends to 1 is quite chaotic: see Figure VIII.9 for a rendering. It is possible to extend the analysis
of log P(z) near the real axis by way of the Mellin transform to the case $z = e^{-t-i\phi}$ as $t \to 0$
and $\phi = 2\pi\frac{p}{q}$ is commensurate to $2\pi$. In that case, one must operate with

    $L_\phi(t) = \sum_{m\ge 1} \frac{1}{m} \frac{e^{-m(t+i\phi)}}{1-e^{-m(t+i\phi)}} = \sum_{m\ge 1} \sum_{k\ge 1} \frac{1}{m} e^{-mk(t+i\phi)},$

which is yet another harmonic sum. The net result is that when z tends radially towards $e^{-2i\pi p/q}$,
then P(z) behaves roughly like

(69)    $\exp\left( \frac{\pi^2}{6q^2(1-|z|)} \right),$

which is a power $1/q^2$ of the exponential growth as $z \to 1^-$. This analysis extends next to
a small arc. Finally, consider a complete covering of the circle by arcs whose centres are of
argument $2\pi j/N$, j = 1, ..., N - 1, with N chosen large enough. A uniform version of the


[Figure VIII.9: two plots. Left panel: surface plot of |P(z)| over the unit disc. Right panel: curves plotted against the angle θ.]

Figure VIII.9. Integer partitions. Left: the surface |P(z)| with P(z) the OGF of
integer partitions. The plot shows the major singularity at z = 1 and smaller peaks
corresponding to singularities at $z = -1$, $e^{\pm 2i\pi/3}$ and other roots of unity. Right: a
plot of $P(re^{i\theta})$ as a function of $\theta$, for various values r = 0.5, ..., 0.75, illustrates
the increasing concentration property of P(z) near the real axis.


bound (69) makes it possible to bound the contribution of the non-central region and prove it
to be exponentially small. There are several technical details to be filled in order to justify this
approach, so that we switch to a more synthetic one based on transformation properties of P(z),
following [14, 17, 22, 321]. (Such properties also enter the Hardy-Ramanujan-Rademacher
formula for $P_n$ in an essential way.)

The fundamental identity satisfied by P(z) reads

(70)    $P(e^{-2\pi\tau}) = \sqrt{\tau}\, \exp\left( \frac{\pi}{12} \left( \frac{1}{\tau} - \tau \right) \right) P(e^{-2\pi/\tau}),$

which is valid when $\Re(\tau) > 0$. The proof is a simple rephrasing of a transformation formula of
Dedekind's $\eta$ (eta) function, summarized in Note VIII.20 below.
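Identity (70) lends itself to a direct numerical check for real $\tau > 0$, using $\log P(q) = -\sum_{m\ge 1} \log(1-q^m)$. The sketch below compares the logarithms of both sides; the function names and the sample values of $\tau$ are assumptions of this example.

```python
import math

def log_P(q, terms=5000):
    """log P(q) = -sum_{m>=1} log(1 - q^m) for 0 <= q < 1."""
    total = 0.0
    power = q
    for _ in range(terms):
        if power == 0.0:          # remaining terms vanish in floating point
            break
        total -= math.log1p(-power)
        power *= q
    return total

def both_sides(tau):
    """Logarithms of the left and right sides of (70) for real tau > 0."""
    lhs = log_P(math.exp(-2 * math.pi * tau))
    rhs = (0.5 * math.log(tau) + (math.pi / 12) * (1 / tau - tau)
           + log_P(math.exp(-2 * math.pi / tau)))
    return lhs, rhs

for tau in (0.05, 0.5, 3.0):
    print(tau, both_sides(tau))
```

For small $\tau$ the left side is evaluated close to the unit circle, while the right side involves P at a point extremely close to 0; this is the mechanism exploited in the text.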


▷ VIII.20. Modular transformation for the Dedekind eta function. Consider

    $\eta(\tau) := q^{1/24} \prod_{m=1}^{\infty} (1 - q^m), \qquad q = e^{2\pi i\tau},$

with $\Im(\tau) > 0$. Then η(τ) satisfies the “modular transformation” formula,

(71)    $\eta\left(-\frac{1}{\tau}\right) = \sqrt{\frac{\tau}{i}}\; \eta(\tau).$

This transformation property is first proved when τ is purely imaginary, i.e., τ = it, then
extended by analytic continuation. Its logarithmic form results from a residue evaluation of the
integral

    $\frac{1}{2i\pi} \int_{\gamma} \cot(\pi s)\, \cot\left(\frac{\pi s}{\tau}\right) \frac{ds}{s},$

with γ a large contour avoiding poles. (This elementary derivation is due to C. L. Siegel. The
function η(τ) satisfies transformation formulae under S : τ ↦ τ + 1 and T : τ ↦ −1/τ, which
generate the group of modular (in fact “unimodular”) transformations τ ↦ (aτ + b)/(cτ + d)
with ad − bc = 1. Such functions are called modular forms.) ◁

Given (70), the behaviour of P(z) away from the positive real axis and near the unit circle
can now be quantified. Here, we content ourselves with a representative special case, the situation
when z → −1. Consider thus P(z) with $z = e^{-2\pi t + i\pi}$ where, for our purposes, we may
take $t = 1/\sqrt{24n}$. Then, Equation (70) relates P(z) to P(z′), with τ = it + 1/2 and

    $z' = e^{-2\pi i/\tau} = \exp\left(-\frac{2\pi t + i\pi}{t^2 + 1/4}\right).$

Thus |z′| → 1 as t → 0 with the important condition that $|z'| - 1 = O((|z| - 1)^{1/4})$. In other
words, z′ has moved away from the unit circle. Thus, since |P(z′)| ≤ P(|z′|), we may apply
the estimate (68) to P(|z′|) to the effect that

    $\log |P(z)| \le \frac{\pi^2}{24\,(1-|z|)}\,(1 + o(1)) \qquad (z \to -1^+).$

This is an instance of what was announced in (69) and is in agreement with the surface plot of
Figure VIII.9. The extension to an arbitrary angle presents no major difficulty.


The two properties developed in (i) and (ii) above guarantee that the approximation (68)
can be used and that tails can be completed. We find accordingly that

    $P_n \sim [z^n]\; \frac{e^{-\pi^2/12}}{\sqrt{2\pi}}\, \sqrt{1-z}\; \exp\left(\frac{\pi^2}{6(1-z)}\right).$

All computations done, this provides:


Proposition VIII.6. The number $p_n$ of partitions of integer n satisfies

(72)    $p_n \equiv [z^n] \prod_{k=1}^{\infty} \frac{1}{1-z^k} \sim \frac{1}{4n\sqrt{3}}\, e^{\pi\sqrt{2n/3}}.$

The singular behaviour along and near the real line is comparable to that of $\exp((1-z)^{-1})$,
which explains a growth of the form $e^{\sqrt{n}}$. ■


The asymptotic formula (72) is only the first term of a complete expansion involving
decreasing exponentials that was discovered by Hardy and Ramanujan in 1917 and
later perfected by Rademacher (see Note VIII.22 below). Whereas the full Hardy–Ramanujan
expansion necessitates considering infinitely many saddle-points near the
unit circle and requires the modular transformation of Note VIII.20, the main term
of (72) only requires the asymptotic expansion of the partition generating function
near z = 1.
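As a numerical illustration of (72), the following minimal Python sketch expands the product $\prod_k (1-z^k)^{-1}$ by elementary dynamic programming and compares the exact counts with the main asymptotic term. The function names are illustrative choices, not notation from the text.

```python
import math

def partition_numbers(n_max):
    """Return [p_0, ..., p_{n_max}] by expanding prod_{k>=1} 1/(1 - z^k)."""
    p = [1] + [0] * n_max
    for part in range(1, n_max + 1):          # multiply by 1/(1 - z^part)
        for n in range(part, n_max + 1):
            p[n] += p[n - part]
    return p

def main_term(n):
    """Main term of (72): exp(pi*sqrt(2n/3)) / (4*n*sqrt(3))."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * n * math.sqrt(3))

p = partition_numbers(1000)
for n in (10, 100, 1000):
    # The ratio tends to 1 slowly, since the relative error is O(n^(-1/2)).
    print(n, p[n], p[n] / main_term(n))
```

The ratio approaches 1 only slowly, which is consistent with (72) being the first term of a longer expansion.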

The principles underlying the partition example have been made into a general 
method by Meinardus [434] in 1954. Meinardus’ method abstracts the essential fea- 
tures of the proof and singles out sufficient conditions under which the analysis of 
an infinite product generating function can be achieved. The conditions, in agree- 
ment with the Mellin treatment of harmonic sums, require analytic continuation of the 
Dirichlet series involved in log P(z) (or its analogue), as well as smallness towards
infinity of that same Dirichlet series. A summary of Meinardus’ method constitutes
Chapter 6 of Andrews’ treatise on partitions [14] to which the reader is referred. The
method applies to many cases where the summands and their multiplicities have a 
regular enough arithmetic structure. 


▷ VIII.21. A simple yet powerful formula. Define (cf. [321, p. 118])

    $P_n^* = \frac{1}{2\pi\sqrt{2}}\, \frac{d}{dn}\left(\frac{e^{K\lambda_n}}{\lambda_n}\right), \qquad K = \pi\sqrt{\frac{2}{3}}, \qquad \lambda_n := \sqrt{n - \frac{1}{24}}.$

Then $P_n^*$ approximates $P_n$ with a relative precision of order $e^{-c\sqrt{n}}$ for some c > 0. For
instance, the error is less than $3 \cdot 10^{-8}$ for n = 1000. [Hint: The transformation formula makes
it possible to evaluate the central part of the integral giving $P_n$ very precisely.] ◁
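The quality of this approximation can be checked in floating point. The sketch below assumes the reading $P_n^* = \frac{1}{2\pi\sqrt{2}} \frac{d}{dn}(e^{K\lambda_n}/\lambda_n)$ with $\lambda_n = \sqrt{n - 1/24}$ and carries out the derivative by hand, using $d\lambda_n/dn = 1/(2\lambda_n)$; the function names are illustrative.

```python
import math

def partition_number(n):
    """Exact p_n by expanding prod_{k>=1} 1/(1 - z^k)."""
    p = [1] + [0] * n
    for part in range(1, n + 1):
        for m in range(part, n + 1):
            p[m] += p[m - part]
    return p[n]

def p_star(n):
    """Approximation of Note VIII.21 (assumed normalization 1/(2*pi*sqrt(2)))."""
    K = math.pi * math.sqrt(2 / 3)
    lam = math.sqrt(n - 1 / 24)
    # d/dn [exp(K*lam)/lam] = exp(K*lam) * (K*lam - 1) / (2*lam^3)
    derivative = math.exp(K * lam) * (K * lam - 1) / (2 * lam ** 3)
    return derivative / (2 * math.pi * math.sqrt(2))

for n in (100, 1000):
    print(n, abs(p_star(n) / partition_number(n) - 1))
```

Double precision limits what can be observed for large n; an exact check of the stated error would need multiprecision arithmetic.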


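▷ VIII.22. The Hardy–Ramanujan–Rademacher expansion. The number of integer partitions
satisfies the exact formula

    $P_n = \frac{1}{\pi\sqrt{2}} \sum_{k=1}^{\infty} A_k(n)\, \sqrt{k}\; \frac{d}{dn}\, \frac{\sinh\left(\frac{\pi}{k}\sqrt{\frac{2}{3}\left(n - \frac{1}{24}\right)}\right)}{\sqrt{n - \frac{1}{24}}},$

    where $A_k(n) = \sum_{h \bmod k,\ \gcd(h,k)=1} \omega_{h,k}\, e^{-2i\pi nh/k},$

$\omega_{h,k}$ is a 24th root of unity, $\omega_{h,k} = \exp(\pi i\, s(h,k))$, and
$s(h,k) = \sum_{\mu=1}^{k-1} \left(\!\left(\frac{\mu}{k}\right)\!\right)\left(\!\left(\frac{h\mu}{k}\right)\!\right)$ is known as a
Dedekind sum, with $((x)) = x - \lfloor x \rfloor - \frac{1}{2}$. Proofs are found in [14, 17, 22, 321]. ◁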


▷ VIII.23. Meinardus’ theorem. Consider the infinite product ($a_n \ge 0$)

    $f(z) = \prod_{n=1}^{\infty} (1 - z^n)^{-a_n}.$

The associated Dirichlet series is $\alpha(s) = \sum_{n\ge 1} \frac{a_n}{n^s}$. Assume that α(s) is continuable into a
meromorphic function to $\Re(s) \ge -C_0$ for some $C_0 > 0$, with only a simple pole at some
ρ > 0 and corresponding residue A; assume also that α(s) is of moderate growth in the half-plane,
namely, $\alpha(s) = O(|s|^{C_1})$, for some $C_1 > 0$ (as |s| → ∞ in $\Re(s) \ge -C_0$). Let
$g(z) = \sum_{n\ge 1} a_n z^n$ and assume a concentration condition of the form

    $\Re\, g(e^{-t-2i\pi y}) - g(e^{-t}) \le -C_2\, y^{-\epsilon}.$

Then the coefficient $f_n = [z^n] f(z)$ satisfies

    $f_n \sim C\, n^{\kappa} \exp\left(K n^{\rho/(\rho+1)}\right), \qquad K = (1 + \rho^{-1}) \left[A\,\Gamma(\rho+1)\,\zeta(\rho+1)\right]^{1/(\rho+1)}.$

The constants C, κ are:

    $C = e^{\alpha'(0)}\, (2\pi(1+\rho))^{-1/2} \left[A\,\Gamma(\rho+1)\,\zeta(\rho+1)\right]^{(1-2\alpha(0))/(2\rho+2)}, \qquad \kappa = \frac{\alpha(0) - 1 - \frac{1}{2}\rho}{1+\rho}.$

Details of the concentration condition, and error terms are found in [14, Ch 6]. ◁


▷ VIII.24. Various types of partitions. The number of partitions into distinct odd summands,
squares, cubes, triangular numbers, are essentially cases of application of Meinardus’ method.
For instance the method provides, for the number $Q_n$ of partitions into distinct summands, the
asymptotic form

    $Q_n \equiv [z^n] \prod_{m\ge 1} (1 + z^m) \sim \frac{e^{\pi\sqrt{n/3}}}{4 \cdot 3^{1/4}\, n^{3/4}}.$

The central approximation is obtained by a Mellin analysis from

    $L(t) := \log Q(e^{-t}) = \sum_{m=1}^{\infty} \frac{(-1)^{m-1}}{m}\, \frac{e^{-mt}}{1 - e^{-mt}}, \qquad L^*(s) = \Gamma(s)\,\zeta(s)\,\zeta(s+1)\,(1 - 2^{-s}),$

    $L(t) \sim \frac{\pi^2}{12\,t} - \log\sqrt{2} + \frac{1}{24}\,t.$

(See the already cited references [14, 17, 22, 321].) ◁


▷ VIII.25. Plane partitions. A plane partition of a given number n is a two-dimensional array
of integers $n_{i,j}$ that are non-increasing both from left to right and top to bottom and that add up
to n. The first few terms (EIS A000219) are 1, 1, 3, 6, 13, 24, 48, 86, 160, 282, 500, 859 and P.
A. MacMahon proved that the OGF is

    $R(z) = \prod_{m=1}^{\infty} (1 - z^m)^{-m}.$

Meinardus’ method applies to give

    $R_n \sim (\zeta(3)\, 2^{-11})^{1/36}\; n^{-25/36} \exp\left(3 \cdot 2^{-2/3}\, \zeta(3)^{1/3}\, n^{2/3} + 2c\right),$
where c = — 77 (log 2x) +y—1). 
(See [14, p. 199] for this result due to Wright [617] in 1931.) <q 


▷ VIII.26. Partitions into primes. Let $P_n^{(\Pi)}$ be the number of partitions of n into summands
that are all prime numbers,

    $P^{(\Pi)}(z) = \prod_{m=1}^{\infty} \frac{1}{1 - z^{p_m}},$

where $p_m$ is the mth prime ($p_1 = 2$, $p_2 = 3$, ...). The sequence starts as (EIS A000607):
1, 0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 6, 7, 9, 10, 12, 14, 17, 19, 23, 26, 30, 35, 40.
Then

(73)    $\log P_n^{(\Pi)} \sim 2\pi \sqrt{\frac{n}{3 \log n}}.$

An upper bound of a form consistent with (73) can be derived elementarily as a saddle-point
bound based on the property

    $\sum_{n\ge 1} e^{-t p_n} \sim \frac{1}{t\, \log(1/t)}, \qquad t \to 0.$

This last fact results either from the Prime Number Theorem or from a Mellin analysis based
on the fact that $\Pi(s) := \sum_n p_n^{-s}$ satisfies, with μ(m) the Möbius function,

    $\Pi(s) = \sum_{m=1}^{\infty} \frac{\mu(m)}{m}\, \log \zeta(ms).$

(See Roth and Szekeres’ study [519] as well as the articles by Yang [625] and Vaughan [593]
for relevant references and recent technology.) The present situation is in sharp contrast with
that of compositions into primes (see Chapter V, p. 297), for which the analysis turned out to
be especially easy. ◁
▷ VIII.27. Partitions into powers of 2. Let $M_n$ be the number of partitions of integer n into
summands that are powers of 2. Thus $M(z) = \prod_{m\ge 0} (1 - z^{2^m})^{-1}$. The sequence $(M_n)$ starts
as 1, 1, 2, 2, 4, 4, 6, 6, 10 (EIS A018819). One has

    $\log M_{2n} = \frac{1}{2\log 2}\left(\log\frac{n}{\log n}\right)^2 + \left(\frac{1}{2} + \frac{1}{\log 2} + \frac{\log\log 2}{\log 2}\right)\log n + O(\log\log n).$

De Bruijn [141] determined the precise asymptotic form of $M_{2n}$. (See also [179] for related
problems.) ◁
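The initial values quoted in the last three notes can be reproduced by expanding the corresponding infinite products. The sketch below treats a product $\prod_{m\ge 1}(1-z^m)^{-a_m}$ with non-negative integer multiplicities $a_m$; the helper names are illustrative and not part of the text.

```python
def euler_product_coefficients(multiplicity, n_max):
    """Coefficients of prod_{m>=1} (1 - z^m)^(-multiplicity(m)) up to z^n_max."""
    c = [1] + [0] * n_max
    for m in range(1, n_max + 1):
        for _ in range(multiplicity(m)):      # one factor 1/(1 - z^m) at a time
            for n in range(m, n_max + 1):
                c[n] += c[n - m]
    return c

def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def is_power_of_two(m):
    return (m & (m - 1)) == 0                 # valid for m >= 1

plane = euler_product_coefficients(lambda m: m, 11)                       # R(z)
primes = euler_product_coefficients(lambda m: int(is_prime(m)), 23)
binary = euler_product_coefficients(lambda m: int(is_power_of_two(m)), 8)  # M(z)
print(plane, primes, binary, sep="\n")
```

The same routine covers ordinary partitions (multiplicity 1 for every m), which is the case treated by Proposition VIII.6.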


Averages and moments. Based on the foregoing analysis, it is possible to perform 
the analysis of several parameters of integer partitions (see also our general discussion 
of moments in Subsection VIII.9.1, p. 594). In particular, it becomes possible to 
justify the empirical observations regarding the profile of partitions made in the course 
of Example III.7, p. 171. 


▷ VIII.28. Mean number of parts in integer partitions. The mean number of parts (or summands)
in a random integer partition of size n is

    $\frac{1}{K}\sqrt{n}\,\log n + O(n^{1/2}), \qquad K = \pi\sqrt{\frac{2}{3}}.$

For a partition into distinct parts, the mean number of parts is

    $\frac{2\sqrt{3}\,\log 2}{\pi}\,\sqrt{n} + o(n^{1/2}).$

The complex analytic proof starts from the BGFs of Subsection III. 3.3, p. 170 and, analytically,
it only requires the central estimates of $\log P(e^{-t})$ and $\log Q(e^{-t})$, given the concentration
properties, as well as the estimates

    $\sum_{m\ge 1} \frac{e^{-mt}}{1 - e^{-mt}} \sim \frac{-\log t + \gamma}{t} + \frac{1}{4}, \qquad \sum_{m\ge 1} \frac{(-1)^{m-1} e^{-mt}}{1 - e^{-mt}} \sim \frac{\log 2}{t} - \frac{1}{4},$

which result from a standard Mellin analysis, the respective transforms being

    $\Gamma(s)\,\zeta(s)^2, \qquad \Gamma(s)\,(1 - 2^{1-s})\,\zeta(s)^2.$

Full asymptotic expansions of the mean and of moments of any order can be determined. In
addition, the distributions are concentrated around their mean. (The first-order estimates are
due to Erdős and Lehner [194] who gave an elementary derivation and also obtained the limit
distribution of the number of summands in both cases: they are a double exponential (for P)
and a Gaussian (for Q).) ◁
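The two harmonic-sum estimates of this note are easy to test numerically. The following sketch sums the series directly for small t and compares with the two-term approximations $(-\log t + \gamma)/t + 1/4$ and $(\log 2)/t - 1/4$; the truncation rule and the function name are assumptions made for the illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329

def lambert_sum(t, alternating=False):
    """Sum over m >= 1 of (+/-) e^{-mt} / (1 - e^{-mt}), truncated once e^{-mt} < e^{-60}."""
    total = 0.0
    for m in range(1, int(60 / t) + 2):
        x = math.exp(-m * t)
        term = x / (1 - x)
        total += -term if (alternating and m % 2 == 0) else term
    return total

t = 0.01
plain_estimate = (-math.log(t) + EULER_GAMMA) / t + 0.25
alternating_estimate = math.log(2) / t - 0.25
print(lambert_sum(t), plain_estimate)
print(lambert_sum(t, alternating=True), alternating_estimate)
```

The remaining discrepancies are of order t, in line with the next terms of the Mellin expansions.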


VIII. 7. Saddle-points and linear differential equations.


The purpose of this section is to complete the classification of singularities of
linear ordinary differential equations (see Subsection VII. 9.1, p. 518 for the so-called
“regular” case) and briefly point to potentially useful saddle-point connections. What
is given is, once more, a linear differential equation (linear ODE) of the form

(74)    $\partial^d Y(z) + d_1(z)\,\partial^{d-1} Y(z) + \cdots + d_d(z)\,Y(z) = 0, \qquad \partial \equiv \frac{d}{dz},$

(cf. Equation (114), p. 519) and a simply connected open domain Ω where the coefficients
$d_j(z)$ are meromorphic. It is assumed that the coefficients $d_j(z)$ have a pole
at a single point ζ ∈ Ω and are analytic elsewhere. As we know, it is only at such a
point ζ that singularities of solutions may arise.
Consider for instance the ODE

(75)    $(1-z)^2\, Y'(z) - (2-z)\, Y(z) = 0,$

in a neighbourhood of ζ = 1. The method of trying to match an approximate solution
of the form $(z-1)^{\theta}$ for some θ ∈ C does not succeed: there is no way to find a value
of θ for which there is a cancellation between two terms in the main asymptotic order.
Accordingly, the conditions of Definition VII.7, p. 519, relative to regular singularities
fail to be satisfied: in such cases, we say that the point ζ is an irregular singularity of
the linear ODE. In fact, the solution of (75), together with Y(0) = 1, is explicit (see
also Example VIII.13 and Note VIII.43, p. 597): $Y(z) = \frac{1}{1-z} \exp\left(\frac{z}{1-z}\right)$.
Thus, we encounter an exponential-of-pole singularity rather than the plain algebraic–logarithmic
singularity that prevails in the regular case. The general case is hardly
more complicated to state⁷.
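The explicit solution can be cross-checked at the level of Taylor coefficients. The sketch below assumes the form $(1-z)^2 Y' = (2-z) Y$ for (75); extracting $[z^n]$ gives the recurrence $(n+1)\,y_{n+1} = (2n+2)\,y_n - n\,y_{n-1}$, which is compared with the closed form $[z^n]\, e^{z/(1-z)}/(1-z) = \sum_k \binom{n}{k}/k!$. Function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def coefficients_from_ode(n_max):
    """Taylor coefficients y_0..y_{n_max} of the solution of (75) with Y(0) = 1."""
    y = [Fraction(1)]
    for n in range(n_max):
        previous = y[n - 1] if n >= 1 else Fraction(0)
        y.append(((2 * n + 2) * y[n] - n * previous) / (n + 1))
    return y

def coefficients_from_closed_form(n_max):
    """[z^n] exp(z/(1-z))/(1-z) = sum_k C(n, k)/k!, from expanding the exponential."""
    return [sum(Fraction(comb(n, k), factorial(k)) for k in range(n + 1))
            for n in range(n_max + 1)]

ode = coefficients_from_ode(12)
closed = coefficients_from_closed_form(12)
print([int(ode[n] * factorial(n)) for n in range(6)])
```

Both routes give the same exact rationals, which supports the exponential-of-pole form of the solution.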


Theorem VIII.7 (Structure theorem for irregular singularities). Let there be given a
differential equation of the form (74), a singular point ζ, and a sector S with vertex
at ζ. Then, for z in a sufficiently small sector S′ of S and for |z − ζ| sufficiently
small, there exists a basis of d linearly independent solutions of (74), such that any
solution Y in that basis admits, as z → ζ in S′, an asymptotic expansion

(76)    $Y(z) \sim \exp\left(P(Z^{-1/r})\right) Z^{a} \sum_{j=0}^{\infty} Q_j(\log Z)\, Z^{js}, \qquad Z := (z - \zeta),$

where P is a polynomial, r an integer of $\mathbb{Z}_{>0}$, a is a complex number, s is a rational
number of $\mathbb{Q}_{>0}$, and the $Q_j$ are a family of polynomials of uniformly bounded degree.


Proof. The proof [602, p. 11] starts by constructing a basis of formal solutions, each of
the form (76), by the method of indeterminate coefficients and exponents. It continues
by appealing to a summation mechanism that transforms such formal solutions into
actual analytic ones. (The restriction of the statement to sectors is inherent: it is
related to what is known as the “Stokes phenomenon”⁸ of ODE theory [602, §15].)


In particular, if the polynomial P that intervenes in the expansion (76) has a
positive leading coefficient and the sector is large enough, then the intervening quantities
are Hayman admissible. In this way, up to (possibly difficult) connection problems,
the coefficients of solutions to meromorphic ODEs can in principle be analysed,
whether the singularities be of the regular or irregular type. Indeed, proceeding at
least formally (see the analysis of fragmented permutations in Example VIII.7, p. 562
and Note VIII.7, p. 563 for similar computations) suggests that the coefficients of a
solution to a linear ODE with meromorphic coefficients are finite linear combinations
of asymptotic elements of the form

(77)    $\zeta^{-n} \exp\left(R(n^{1/\rho})\right) n^{\alpha} \sum_{j=0}^{\infty} S_j(\log n)\, n^{-j\sigma},$

where R is a polynomial, ρ an integer of $\mathbb{Z}_{>0}$, α is a complex number, σ is a rational
number of $\mathbb{Q}_{>0}$, and the $S_j$ are a family of polynomials of uniformly bounded degree.


⁷ Singularities at infinity can be transformed into singularities at 0 via Z := 1/z.
⁸ The Stokes phenomenon is roughly the fact that solutions of an ODE with irregular singular points
may involve certain discontinuities in asymptotic expansions, relative to different sectors.
(The case of entire functions with an irregular singularity at infinity further introduces
multipliers in the form of fractional powers of n!.)
The fact that expansions of the type (77) hold in all generality is probably true,
but far from being accepted as a theorem by experts. Odlyzko [461, pp. 1135-1138],
Wimp [610, p. 64], and Wimp–Zeilberger [611] offer a lucid (and prudent) discussion
of these questions. The result (77) was claimed by G.D. Birkhoff and Trjitzinsky [70,
71], based directly on their general theory of analytic difference equations, but in
Wimp’s words (footnote on p. 64 of [610]):

    “Some now believe that the Birkhoff–Trjitzinsky theory has disabling gaps, see
    [342]. The alleged deficiencies are difficult to discern by a casual inspection of
    the papers [70, 71] since they are extremely long and their arguments are very
    laborious. My policy is not to use the theory unless its results can be substantiated
    by other arguments.”
A sound strategy consists in basing an analysis of linear ODEs with an irregular singu- 
larity on the well-established Theorem VIII.7 and accordingly work out local singular 
expansions. Then determine a suitable integration contour for the Cauchy coefficient 
formula that wanders from valley to valley, and estimate the local contribution of each 
singularity that has an exponential growth by means of the saddle-point method—for 
regular singularities, use a Hankel contour, as in Subsection VII.9.1, p. 518. (As 
already noted, this may involve delicate connection problems as well as difficulties 
related to the Stokes phenomenon.) The positivity attached to combinatorial problems 
can often be used to restrict attention to asymptotically dominant solutions. Estimates 
involving asymptotic elements of the form (77) must eventually result, whenever the 
strategy is successful. This is in particular applicable to holonomic sequences and 
functions in the sense of Appendix B.4: Holonomic functions, p. 748. 


Example VIII.9. Symmetric matrices with constant row sums. Let $\mathcal{Y}_{k,n}$ be the class of n × n
symmetric matrices with non-negative integer entries and all row sums (hence also column
sums) equal to k. The problem is to determine the cardinalities $Y_{k,n}$ for small values of k. It
is equivalent to determining the number of (regular, undirected) multigraphs, where all vertices
have degree exactly k. We let $Y_k(z)$ represent the corresponding EGF.

For all k, the EGF $Y_k(z)$ is holonomic; that is, it satisfies a linear ODE with polynomial
coefficients. This results from Gessel’s theory of holonomic symmetric functions (p. 748). We
follow here Chyzak, Mishna, and Salvy [122], who developed an original class of effective
algorithms, which inter alia provide a means of computing the $Y_k$. The cases k = 1 and k = 2
succumb to elementary combinatorics, but the problem becomes non-trivial as soon as k ≥ 3.
We consider here k = 1, 2, 3.


Case k = 1. A matrix of $\mathcal{Y}_{1,n}$ is none other than a symmetric permutation matrix, which is
bijectively associated with an involution, so that $Y_1(z) = e^{z + z^2/2}$. In that case, the saddle-point
method applied to the entire function $Y_1(z)$ yields (Example VIII.5, p. 558):

(78)    $Y_{1,n} \sim \frac{1}{(8e\pi)^{1/4}}\, \sqrt{n!}\; \frac{e^{\sqrt{n}}}{n^{1/4}}.$

Case k = 2. This one is a classic of combinatorial theory [554, pp. 16-19]. A matrix
of $\mathcal{Y}_{2,n}$ is the incidence matrix of a multigraph in which all vertices have degree exactly equal
to 2. A bit of combinatorial reasoning (compare with 2-regular graphs in Note II.22, p. 133)
shows that connected components can be only one of four types:

    single nodes:                       $z$
    undirected segments:                $\frac{z^2}{2}\,\frac{1}{1-z}$
    2-cycles:                           $\frac{z^2}{2}$
    undirected cycles of length ≥ 3:    $\frac{1}{2}\log\frac{1}{1-z} - \frac{z}{2} - \frac{z^2}{4}$

(The corresponding EGFs are given next to each type; their sum provides $\log Y_2(z)$.) Thus, after
simplifications, we obtain

(79)    $Y_2(z) = \frac{1}{\sqrt{1-z}}\, \exp\left(\frac{z^2}{4} + \frac{1}{2}\,\frac{z}{1-z}\right).$

The sequence $Y_{2,n}$ starts as 1, 1, 3, 11, 56, 348 (EIS A000985). An asymptotic estimate results
from an analysis entirely similar to that of fragmented permutations (Example VIII.7, p. 562),
since the singularity is of an “exponential-of-pole type”, only modulated by a function of moderate
growth, $(1-z)^{-1/2}$. We find:

(80)    $Y_{2,n} \sim n!\; \frac{e^{\sqrt{2n}}}{2\sqrt{\pi n}}.$

Case k = 3. Chyzak, Mishna, and Salvy determined that $Y = Y_3$ satisfies the linear ODE

    $\phi_2(z)\,\partial^2 Y(z) + \phi_1(z)\,\partial Y(z) + \phi_0(z)\,Y(z) = 0,$

where the coefficients are as in the following table:


    $\phi_0(z) = z^{11} + z^{10} - 6z^9 - 4z^8 + 11z^7 - 15z^6 + 8z^5 - 2z^3 + 12z^2 - 24z - 24$
    $\phi_1(z) = -3\,(z^{10} - 2z^8 + 2z^6 - 6z^5 + 8z^4 + 2z^3 + 8z^2 + 16z - 8)$
    $\phi_2(z) = 9z^3\,(z^4 - z^2 + z - 2).$
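An ODE of this kind translates into a linear recurrence on the coefficients $y_n = Y_{3,n}/n!$. The sketch below uses one reading of the coefficient table, namely $\phi_0 = z^{11} + z^{10} - 6z^9 - 4z^8 + 11z^7 - 15z^6 + 8z^5 - 2z^3 + 12z^2 - 24z - 24$, $\phi_1 = -3(z^{10} - 2z^8 + 2z^6 - 6z^5 + 8z^4 + 2z^3 + 8z^2 + 16z - 8)$ and $\phi_2 = 9z^3(z^4 - z^2 + z - 2)$; this reading is an assumption recovered from a damaged table, and the check is only that it regenerates the first values 1, 1, 4, 23, 214, 2698 quoted in the example.

```python
from fractions import Fraction
from math import factorial

# Polynomial coefficients, index = power of z (assumed reading of the table).
PHI0 = [-24, -24, 12, -2, 0, 8, -15, 11, -4, -6, 1, 1]
PHI1 = [-3 * c for c in [-8, 16, 8, 2, 8, -6, 2, 0, -2, 0, 1]]
PHI2 = [0, 0, 0] + [9 * c for c in [-2, 1, -1, 0, 1]]

def next_coefficient(y):
    """Given y_0..y_m, solve [z^m](phi2*Y'' + phi1*Y' + phi0*Y) = 0 for y_{m+1}."""
    m = len(y) - 1
    d1 = [k * y[k] for k in range(1, m + 1)]      # Y'  coefficients 0..m-1
    d2 = [k * d1[k] for k in range(1, m)]         # Y'' coefficients 0..m-2
    rest = sum(c * y[m - i] for i, c in enumerate(PHI0) if i <= m)
    rest += sum(c * d1[m - i] for i, c in enumerate(PHI1) if 1 <= i <= m)
    rest += sum(c * d2[m - i] for i, c in enumerate(PHI2) if 3 <= i <= m)
    # The only term containing y_{m+1} is PHI1[0] * (m+1) * y_{m+1}.
    return -Fraction(rest) / (PHI1[0] * (m + 1))

def y3_values(count):
    y = [Fraction(1)]
    while len(y) < count:
        y.append(next_coefficient(y))
    return [int(y[n] * factorial(n)) for n in range(count)]

print(y3_values(6))
```

Each new value costs a bounded number of arithmetic operations, so the whole sequence is obtained in a number of operations that is linear in the index.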


The first values of $Y_{3,n}$ are 1, 1, 4, 23, 214, 2698. Based on analogy with (78) and (80) supplemented
by rough combinatorial bounds, we expect the sequence $Y_{3,n}$ to have a growth comparable
to $n!^{3/2}$; that is, the EGF $Y_3(z)$ has radius of convergence 0. The authors of [122] then opt
to introduce a modified GF, obtained by a Hadamard product,


Bu yn gent 
Y. = Y pai 2 et Le 
3) = ¥3@) © 20 3-4-9n tLe On FD 


n>0 


whose radius of convergence is finite and non-zero. Thanks to dedicated symbolic computation
algorithms and programs, they determine that this modified GF satisfies a linear ODE of order 29,


28 
27132? — 4)°a29 F(z) + > $j Ez V(z) =0, 
j=0 
with coefficients $\phi_j(z)$ of degree 37(!). This corresponds to a dominant singularity at
$\zeta = 2/\sqrt{3}$, while the square factor $(3z^2 - 4)^2$ betrays an irregular singularity. A local analysis of
the ODE then reveals the existence of exactly one singular solution at ζ (up to a multiplicative
constant),


3 1/2 145 8591 _> 
& aeen 4 [eee ee Z:=1- 
o(z) ~ exp (; ) i 41472 = , Z/C, 
whose form is in general agreement with Theorem VIII.7. We must then have $\hat{Y}_3(z) \sim \lambda\,\sigma(z)$
as z → ζ, for some constant λ > 0, and a similar analysis applies to the conjugate root
$\zeta' = -2/\sqrt{3}$. The form obtained for $\hat{Y}_3(z)$ is of the exponential-of-pole type, hence amenable
to a saddle-point analysis. Omitting intermediate computations, one finds eventually

(81)    $Y_{3,n} \sim C_3\; n!^{3/2} \left(\frac{\sqrt{3}}{2}\right)^{n} \frac{\exp(\sqrt{3n})}{n^{3/4}},$

for a connection constant $C_3$ that is determined numerically: $C_3 \doteq 0.37720$. ■
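For the case k = 2, the initial values can be recovered from the EGF. The sketch below assumes $\log Y_2(z) = \frac{z^2}{4} + \frac{1}{2}\frac{z}{1-z} + \frac{1}{2}\log\frac{1}{1-z}$, whose Taylor coefficients are $l_k = \frac{1}{2} + \frac{1}{2k}$ (plus $\frac{1}{4}$ when k = 2), and exponentiates the series through the relation $Y' = (\log Y)'\, Y$. The function name is illustrative.

```python
from fractions import Fraction
from math import factorial

def y2_values(count):
    """First values of Y_{2,n}, from the assumed closed form of log Y_2(z)."""
    def l(k):
        extra = Fraction(1, 4) if k == 2 else Fraction(0)
        return Fraction(1, 2) + Fraction(1, 2 * k) + extra
    y = [Fraction(1)]
    for n in range(1, count):
        # Y' = (log Y)' * Y  gives  n*y_n = sum_{k=1..n} k*l_k*y_{n-k}
        y.append(sum(k * l(k) * y[n - k] for k in range(1, n + 1)) / n)
    return [int(y[n] * factorial(n)) for n in range(count)]

print(y2_values(6))
```

Exact rational arithmetic avoids any rounding, so the output can be compared directly with the sequence quoted in the example.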


▷ VIII.29. An asymptotic pattern. Based on (78), (79), (81), and further (heavier) computations
at k = 4, Chyzak et al. [122] observe the general asymptotic pattern:

    $Y_{k,n} \sim C_k\; n!^{k/2} \left(\frac{k^{k/2}}{k!}\right)^{n} \frac{\exp(\sqrt{kn})}{n^{k/4}}, \qquad C_k = \frac{1}{\sqrt{2}}\, \frac{e^{k(k-2)/4}}{(2\pi)^{k/4}}.$

This asymptotic formula is indeed valid for each fixed k: it results from estimates of Bender
and Canfield [39]. Although it is here limited to small values of k, the method of Chyzak et
al. still has two advantages: (i) the exact values of the counting sequence are computable in
a linear number of arithmetic operations; (ii) complete asymptotic expansions can be obtained
comparatively easily. ◁
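A quick consistency check of the pattern is to evaluate the constant $C_k$ for k = 1, 2, 3. The sketch assumes the form $C_k = e^{k(k-2)/4} / (\sqrt{2}\,(2\pi)^{k/4})$ and compares it with the constants $(8e\pi)^{-1/4}$ and $1/(2\sqrt{\pi})$ of the cases k = 1 and k = 2, and with the numerical value 0.37720 reported for k = 3.

```python
import math

def c_k(k):
    """Constant of the asymptotic pattern (assumed form)."""
    return math.exp(k * (k - 2) / 4) / (math.sqrt(2) * (2 * math.pi) ** (k / 4))

print(c_k(1), (8 * math.e * math.pi) ** -0.25)
print(c_k(2), 1 / (2 * math.sqrt(math.pi)))
print(c_k(3))
```

Agreement for all three values supports the reconstructed exponents, although it does not by itself prove the general statement.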


▷ VIII.30. The number of regular matrices. The asymptotic enumeration of regular (non-symmetric)
matrices is treated by Békéssy, Békéssy, and Komlós in [32] and by Bender in [37].
Combining their results with estimates of Bender and Canfield [39] yields the following table 
of asymptotic values for the number of regular matrices with row and column sums equal to k: 


(0, 1)-entries non-negative entries 
= n 
Symmetric | e~ #04, Fin EDIT kn (REY exp(Vkn) 
(k!)” J2 (20)k/4 ; k} nk/4 
Non-sym. eo K—1?/2 - (nk)! (ay e(k-1)"/2 - (nk)! (a eck 


(There, J, is the number of involutions of size n; see Proposition VIII.2, p. 560.) Thus the 
number of regular graphs, either directed or undirected, and with or without multiple edges, is 
asymptotically known. 


▷ VIII.31. Multidimensional integral representations. It is of interest to observe the
multidimensional contour integral representation


1 1 1 dx, ---dxn 
ten = Ge || TW (Goa) (Gx) . 
(2ix)" ee 1 = xjXx; f 1-x; il meer 


in connection with the advanced saddle-point methods of McKay and his coauthors [296,
432]. Find similar integral representations for all the cases of Note VIII.30 above. ◁


VIII. 8. Large powers


The extraction of coefficients in powers of a fixed function and more generally
in functions of the form $A(z)B(z)^n$ constitutes a prototypical and easy application of
the saddle-point method. We will accordingly be concerned here with the problem of
estimating

(82)    $[z^N]\, A(z) \cdot B(z)^n = \frac{1}{2i\pi} \oint A(z)\, B(z)^n\, \frac{dz}{z^{N+1}},$

as both n and N get large. This situation generalizes directly the example of the
exponential and its inverse factorial coefficients, where we have dealt with a coefficient
extraction equivalent to $[z^n](e^z)^n$ (see pp. 549 and 555), as well as the case of
the central binomial coefficients (p. 549), corresponding to $[z^n](1+z)^{2n}$. General
estimates relative to (82) are derived in Subsections VIII. 8.1 (bounds) and VIII. 8.2
(asymptotics). We finally discuss perturbations of the basic saddle-point paradigm in
the case of large powers (Subsection VIII. 8.3): Gaussian approximations are obtained
in a way that generalizes “local” versions of the Central Limit Theorem for sums of
discrete random variables. This last subsection paves the way for the analysis of limit
laws in the next chapter, where the rich framework of “quasi-powers” will be shown
to play a central rôle in so many combinatorial applications.


VIII. 8.1. Large powers: saddle-point bounds. We consider throughout this
section two fixed functions, A(z) and B(z) satisfying the following conditions.


L1: The functions $A(z) = \sum_{j\ge 0} a_j z^j$ and $B(z) = \sum_{j\ge 0} b_j z^j$ are analytic at 0
and have non-negative coefficients; furthermore it is assumed (without loss
of generality) that B(0) ≠ 0.

L2: The function B(z) is aperiodic in the sense that $\gcd\{j \mid b_j > 0\} = 1$.
(Thus B(z) is not a function of the form $\beta(z^p)$ for some integer p ≥ 2 and
some β analytic at 0.)

L3: Let R ≤ ∞ be the radius of convergence of B(z); the radius of convergence
of A(z) is at least as large as R.


Define the quantity T called the spread:

(83)    $T := \lim_{x \to R^-} \frac{x\, B'(x)}{B(x)}.$

Our purpose is to analyse the coefficients

    $[z^N]\, A(z) \cdot B(z)^n,$

when N and n are linearly related. The condition N < Tn will be imposed: it is both
technically needed in our proof and inherent in the nature of the problem. (For B a
polynomial of degree d, the spread is T = d; for a function B whose derivative at its
dominant positive singularity remains bounded, the spread is finite; for $B(z) = e^z$ and
more generally for (non-polynomial) entire functions, the spread is T = ∞.)
Saddle-point bounds result almost immediately from the previous assumptions.
Saddle-point bounds result almost immediately from the previous assumptions. 


Proposition VIII.7 (Saddle-point bounds for large powers). Consider functions A(z)
and B(z) satisfying the conditions L1, L2, L3 above. Let λ be a positive number with
0 < λ < T and let ζ be the unique positive root of the equation

    $\zeta\, \frac{B'(\zeta)}{B(\zeta)} = \lambda.$

Then, for N = λn an integer, one has

    $[z^N]\, A(z) \cdot B(z)^n \le A(\zeta)\, B(\zeta)^n\, \zeta^{-N}.$

Proof. The existence and unicity of ζ is guaranteed by an argument already encountered
several times (Note IV.46, p. 280, and Note VIII.4, p. 550). The conclusion
then follows by an application of general saddle-point bounds (Corollary VIII.1,
p. 549). ■


Example VIII.10. Entropy bounds for binomial coefficients. Consider the problem of estimating
the binomial coefficients $\binom{n}{\lambda n}$ for some λ with 0 < λ < 1 and N = λn. Proposition VIII.7
provides

    $\binom{n}{\lambda n} \equiv [z^N](1+z)^n \le (1+\zeta)^n\, \zeta^{-N},$

where $\frac{\zeta}{1+\zeta} = \lambda$, i.e., $\zeta = \frac{\lambda}{1-\lambda}$. A simple computation then shows that

    $\binom{n}{\lambda n} \le \exp(n H(\lambda)), \qquad \text{where} \quad H(\lambda) = -\lambda \log \lambda - (1-\lambda)\log(1-\lambda)$

is the entropy function. Thus, for λ ≠ 1/2, the binomial coefficients $\binom{n}{\lambda n}$ are exponentially
smaller than the central coefficient $\binom{n}{n/2}$, and the entropy function precisely quantifies this
exponential gap. ■
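The entropy bound is simple to test exhaustively for a given n. The sketch below compares $\log \binom{n}{k}$ with $n H(k/n)$ for all 0 < k < n; the function names are illustrative.

```python
import math

def entropy(lam):
    """H(lam) = -lam*log(lam) - (1-lam)*log(1-lam), natural logarithms."""
    return -lam * math.log(lam) - (1 - lam) * math.log(1 - lam)

def bound_gap(n, k):
    """n*H(k/n) - log C(n, k); non-negative when the saddle-point bound holds."""
    return n * entropy(k / n) - math.log(math.comb(n, k))

n = 200
gaps = [bound_gap(n, k) for k in range(1, n)]
print(min(gaps), max(gaps))
```

The gap stays of logarithmic size in n, which reflects the fact that the saddle-point bound captures the exponential order but not the polynomial factor.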


▷ VIII.32. Anomalous dice games. The probability of a score equal to λn in n casts of an
unbiased die is bounded from above by a quantity of the form $e^{-nK}$ where

    $K = \log 6 - \log\frac{1 - \zeta^6}{1 - \zeta} + (\lambda - 1)\log\zeta,$

and ζ is an algebraic function of λ determined by $\sum_{j=0}^{5} (\lambda - j - 1)\,\zeta^j = 0$. ◁


▷ VIII.33. Large deviation bounds for sums of random variables. Let $g(u) = E(u^X)$ be the
probability generating function of a discrete random variable X ≥ 0 and let μ = g′(1) be the
corresponding mean (assume μ < ∞). Set N = λn and let ζ be the root of ζg′(ζ)/g(ζ) = λ
assumed to exist within the domain of analyticity of g. Then, for λ < μ, one has

    $\sum_{k \le N} [u^k]\, g(u)^n \le \frac{1}{1-\zeta}\, g(\zeta)^n\, \zeta^{-N}.$

Dually, for λ > μ, one finds

    $\sum_{k \ge N} [u^k]\, g(u)^n \le \frac{\zeta}{\zeta - 1}\, g(\zeta)^n\, \zeta^{-N}.$

These are exponential bounds on the probability that n copies of the variable X have a sum
deviating substantially from the expected value. ◁


VIII. 8.2. Large powers: saddle-point analysis. The saddle-point bounds for
large powers are technically shallow but useful, whenever only rough order of magnitude
estimates are sought. In fact, the full saddle-point method is applicable under the
very conditions of the preceding proposition.


Theorem VIII.8 (Saddle-point estimates of large powers). Under the conditions of
Proposition VIII.7, with λ = N/n, one has

(84)    $[z^N]\, A(z) \cdot B(z)^n = A(\zeta)\, \frac{B(\zeta)^n}{\zeta^{N+1} \sqrt{2\pi n \xi}}\, (1 + o(1)),$

where ζ is the unique root of ζB′(ζ)/B(ζ) = λ and

    $\xi = \frac{d^2}{d\zeta^2} \left(\log B(\zeta) - \lambda \log \zeta\right).$

In addition, a full expansion in descending powers of n exists.
These estimates hold uniformly for λ in any compact interval of (0, T), i.e., any
interval [λ′, λ″] with 0 < λ′ < λ″ < T, where T is the spread.
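The estimate (84) is directly computable when A and B are polynomials. The sketch below, an illustration rather than a general implementation, locates ζ by bisection on the increasing function $x B'(x)/B(x)$ and evaluates $\xi = B''/B - (B'/B)^2 + \lambda/\zeta^2$ at ζ. It assumes non-negative coefficients, B(0) ≠ 0 and λ = N/n strictly below the degree of B (the spread); all names are hypothetical.

```python
import math

def poly_eval(coeffs, x):
    return sum(c * x ** j for j, c in enumerate(coeffs))

def poly_derivative(coeffs):
    return [j * c for j, c in enumerate(coeffs)][1:]

def saddle_point_estimate(a, b, n, N):
    """Main term of (84) for [z^N] A(z) B(z)^n, with A, B given as coefficient lists."""
    lam = N / n
    b1 = poly_derivative(b)
    b2 = poly_derivative(b1)
    ratio = lambda x: x * poly_eval(b1, x) / poly_eval(b, x)
    lo, hi = 0.0, 1.0
    while ratio(hi) < lam:                 # bracket the root (needs lam < spread)
        hi *= 2
    for _ in range(200):                   # bisection on the increasing map ratio
        mid = (lo + hi) / 2
        if ratio(mid) < lam:
            lo = mid
        else:
            hi = mid
    zeta = (lo + hi) / 2
    B, B1, B2 = poly_eval(b, zeta), poly_eval(b1, zeta), poly_eval(b2, zeta)
    xi = B2 / B - (B1 / B) ** 2 + lam / zeta ** 2
    return poly_eval(a, zeta) * B ** n / (zeta ** (N + 1) * math.sqrt(2 * math.pi * n * xi))

# Central binomial coefficient [z^50] (1+z)^100, written as B(z) = (1+z)^2, n = 50.
estimate = saddle_point_estimate([1], [1, 2, 1], 50, 50)
print(estimate / math.comb(100, 50))
```

For this example ζ = 1 and ξ = 1/2, so the estimate reduces to the familiar $4^n/\sqrt{\pi n}$.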


Proof. We discuss the analysis corresponding to a fixed λ. For any fixed r such that
0 < r < R, the function $|B(re^{i\theta})|$ is, by positivity of coefficients and aperiodicity,
uniquely maximal at θ = 0 (see The Daffodil Lemma on p. 266). It is also infinitely
differentiable at 0. Consequently there exists a (small) angle $\theta_1 \in (0, \pi)$ such that

    $|B(re^{i\theta})| \le |B(re^{i\theta_1})| \quad \text{for all } \theta \in [\theta_1, \pi],$

and at the same time, $|B(re^{i\theta})|$ is strictly decreasing for $\theta \in [0, \theta_1]$ (it is given by a
Taylor expansion without a linear term).

We carry out the integration along the saddle-point circle, $z = \zeta e^{i\theta}$, where the
previous inequalities on |B(z)| hold. The contribution for $|\theta| > \theta_1$ is exponentially
negligible. Thus, up to exponentially small terms, the desired coefficient is given
asymptotically by $J(\theta_1)$, where

    $J(\theta_1) = \frac{1}{2\pi} \int_{-\theta_1}^{\theta_1} A(\zeta e^{i\theta})\, B(\zeta e^{i\theta})^n\, \zeta^{-N} e^{-iN\theta}\, d\theta.$

It is then possible to impose a second restriction on θ, by introducing $\theta_0$ according to
the general heuristic, namely, $n\theta_0^2 \to \infty$, $n\theta_0^3 \to 0$. We fix here

    $\theta_0 \equiv \theta_0(n) = n^{-2/5}.$

By the decrease of $|B(\zeta e^{i\theta})|$ on $[\theta_0, \theta_1]$ and by local expansions, the quantity $J(\theta_1) - J(\theta_0)$
is of the form $\exp(-c\,n^{1/5})$ for some c > 0, that is, exponentially small.

Finally, local expansions are valid in the central range since $\theta_0$ tends to 0 as n → ∞.
With ξ as in the statement, the second derivative of $\log B(\zeta e^{i\theta}) - i\lambda\theta$ at θ = 0 equals $-\zeta^2\xi$, and
one finds for $z = \zeta e^{i\theta}$ and $|\theta| \le \theta_0$,

    $A(z)\, B(z)^n\, z^{-N} \sim A(\zeta)\, B(\zeta)^n\, \zeta^{-N} \exp(-n \zeta^2 \xi\, \theta^2/2).$

Then the usual process applies upon completing the tails, resulting in the stated estimate.
A complete expansion in powers of $n^{-1/2}$ is obtained by extending the expansion
of log B(z) to an arbitrary order (as in the case of Stirling’s formula, p. 557).
Furthermore, by parity, all the involved integrals of odd order vanish so that the expansion
turns out to be in powers of 1/n (rather than $1/\sqrt{n}$). ■


Example VIII.11. Central binomials and trinomials, Motzkin numbers. An automatic application
of Theorem VIII.8 is to the central binomial coefficient $\binom{2n}{n} = [z^n](1+z)^{2n}$. In the same
way, one gets an estimate of the central trinomial number,

    $T_n := [z^n](1 + z + z^2)^n \quad \text{satisfies} \quad T_n \sim \frac{3^{n+1/2}}{2\sqrt{\pi n}}.$

The Motzkin numbers count unary–binary trees,

    $M_n = [z^n] M(z) \quad \text{where} \quad M = z(1 + M + M^2).$

The standard approach is the one seen earlier based on singularity analysis as the implicitly
defined function M(z) has an algebraic singularity of the $\sqrt{\phantom{x}}$-type, but the Lagrange inversion
formula provides an equally workable route. It gives

    $M_{n+1} = \frac{1}{n+1}\, [z^n](1 + z + z^2)^{n+1},$

which is amenable to saddle-point analysis via Theorem VIII.8, leading to

    $M_n \sim \frac{3^{n+1/2}}{2\sqrt{\pi n^3}}.$

See below for more on this theme. ■
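Both estimates of this example can be compared with exact values obtained by expanding powers of $1 + z + z^2$. The sketch below uses plain integer arithmetic; the Motzkin-type counts are computed through the Lagrange form $M_n = \frac{1}{n}[z^{n-1}](1+z+z^2)^n$, and the function names are illustrative.

```python
import math

def power_coefficients(poly, n, n_max):
    """Exact coefficients up to z^n_max of poly(z)**n."""
    result = [1] + [0] * n_max
    for _ in range(n):
        new = [0] * (n_max + 1)
        for i, r in enumerate(result):
            if r:
                for j, c in enumerate(poly):
                    if i + j <= n_max:
                        new[i + j] += r * c
        result = new
    return result

def central_trinomial(n):
    return power_coefficients([1, 1, 1], n, n)[n]

def motzkin_tree_count(n):
    """M_n for n >= 1, by Lagrange inversion applied to M = z(1 + M + M^2)."""
    return power_coefficients([1, 1, 1], n, n - 1)[n - 1] // n

n = 200
asymptotic = 3 ** (n + 0.5) / (2 * math.sqrt(math.pi * n))
print(central_trinomial(n) / asymptotic)
```

The printed ratio is close to 1, with a deviation of order 1/n, as expected from an expansion in powers of 1/n.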


We have opted for a basic formulation of the theorem with conditions on A and B
that are not minimal. It is easily recognized that the estimates of Theorem VIII.8
continue to hold, provided that the function $|B(re^{i\theta})|$ attains a unique maximum on
the positive real axis, when r ∈ (0, R) is fixed and θ varies on [−π, π]. Also, in order
for the statement to hold true, it is only required that the function A(z) does not vanish
on (0, R), and A(z) or B(z) could then well be allowed to have negative coefficients:
see Note VIII.36. Finally, if A(ζ) = 0, then a simple modification of the argument
still provides precise estimates in this vanishing case; see Note VIII.37 below.
▷ VIII.34. Middle Stirling numbers. The “middle” Stirling numbers of both kinds satisfy


! ! 
al] ~ Ann! (1 a o(n-!)) ; oa {"} ~ cy ARn "2 (1 eh on") , 


where $A_1 \doteq 2.45540$, $A_2 \doteq 1.54413$, and $A_1$, $A_2$ are expressible in terms of special values of
the Cayley tree function. Similar estimates hold for [onl and Cond dq 


▷ VIII.35. Integral points on high-dimensional spheres. Let L(n, α) be the number of lattice
points (i.e., points with integer coordinates) in n-dimensional space that lie on the sphere of
radius $\sqrt{N}$, where N = αn is assumed to be an integer. Then,

    $L(n, \alpha) = [z^N]\, \Theta(z)^n, \qquad \text{where} \quad \Theta(z) := \sum_{m \in \mathbb{Z}} z^{m^2} = 1 + 2\sum_{m=1}^{\infty} z^{m^2}.$

Mazo and Odlyzko [431] show that there exist computable constants C, D depending on α,
such that $L(n, \alpha) \sim C\, n^{-1/2} D^n$. The number of lattice points inside the sphere can be similarly
estimated. (Such bounds are useful in coding theory, combinatorial optimization, especially the
knapsack problem, and cryptography [393, 431].) ◁


▷ VIII.36. A function with negative coefficients that is minimal along the positive axis. Take
$B(z) = 1 + z - z^{10}$. By design, B(z) has both negative and positive Taylor coefficients. On
the other hand, $|B(re^{i\theta})|$ for fixed r < 1/10 (say) attains its unique maximum at θ = 0. For
certain values of N, an estimate of $[z^N] B(z)^n$ is provided by (84): discuss its validity. ◁


▷ VIII.37. Coalescence of a saddle-point with roots of the multiplier. Fix ζ and take a
multiplier A(z) in Theorem VIII.8 such that A(ζ) = 0, but A′(ζ) ≠ 0. The formula (84) is then
to be modified as follows:


Bo)” 
TNH aie + o(1)). 


Higher order cancellations can also be taken into account. dq 


[<4 JA(z) BQ)” = [A'(O) + C4" ] 
Large powers: saddle-points versus singularity analysis. In general, the Lagrange
inversion formula establishes an exact correspondence between two a priori
different problems; namely,


the estimation of coefficients of large order in large powers, and 
the estimation of coefficients of implicitly defined functions. 


In one direction, the Lagrange Inversion Theorem has the capacity of bringing
the evaluation of coefficients of implicit functions into the orbit of the saddle-point
method. Indeed, let $Y$ be defined implicitly by $Y = z\phi(Y)$, where $\phi$ is analytic at 0
and aperiodic. One has, by Lagrange,

\[ [z^{n+1}]\,Y(z) = \frac{1}{n+1}\,[w^n]\,\phi(w)^{n+1}, \]

which is of the type (84). Then, under the assumption that the equation $\phi(\tau) - \tau\phi'(\tau) = 0$
has a positive root within the disc of convergence of $\phi$, a direct application of Theorem VIII.8 yields

\[ [z^n]\,Y(z) \sim \gamma\,\frac{\rho^{-n}}{2\sqrt{\pi n^3}}, \qquad \rho := \frac{\tau}{\phi(\tau)}, \quad \gamma := \sqrt{\frac{2\phi(\tau)}{\phi''(\tau)}}. \]


This last estimate is equivalent to the statement of Theorem VII.2 (p. 453) obtained
there by singularity analysis. (As we know from Chapter VII, this provides the number
of trees in a simple variety, with $\phi$ being the degree generating function of the
variety.) This approach is in a few cases more convenient to work with than singularity
analysis, especially when explicit or uniform upper bounds are required, since
constructive bounds tend to be more easily obtained on circles than on variable Hankel
contours (Note VIII.38).
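
As a sanity check on the correspondence above, the following sketch takes the illustrative choice $\phi(w) = (1+w)^2$ (binary trees, an assumption made here for concreteness). The characteristic equation gives $\tau = 1$, so that $\rho = 1/4$ and $\gamma = 2$, and the Lagrange form of the coefficients can be compared with the asymptotic estimate. The function names are hypothetical.

```python
import math

def tree_count_exact(n):
    """[z^n] Y(z) for Y = z*(1+Y)^2, via Lagrange: (1/n) [w^(n-1)] (1+w)^(2n)."""
    return math.comb(2 * n, n - 1) // n

def tree_count_asymptotic(n):
    """gamma * rho^(-n) / (2 sqrt(pi n^3)) with tau = 1 for phi(w) = (1+w)^2."""
    tau = 1.0
    phi = (1 + tau) ** 2        # phi(tau) = 4
    phi2 = 2.0                  # phi''(tau) = 2
    rho = tau / phi
    gamma = math.sqrt(2 * phi / phi2)
    return gamma * rho ** (-n) / (2 * math.sqrt(math.pi * n ** 3))

for n in (10, 50, 200):
    print(n, tree_count_exact(n) / tree_count_asymptotic(n))
```

The printed ratios should approach 1 as $n$ grows, at a rate compatible with a relative error of order $1/n$.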

Conversely, the Lagrange Inversion Theorem makes it possible to approach prob- 
lems relative to large powers by means of singularity analysis of an implicitly defined 
function’. This mode of operation can prove quite useful when there occurs a coales- 
cence between saddle-points and singularities of the integrand (Note VIII.39). 


▷ VIII.38. An assertion of Ramanujan. In his first letter to Hardy, Ramanujan (1913) announced that

\[ \frac{1}{2}e^n = 1 + \frac{n}{1!} + \frac{n^2}{2!} + \cdots + \frac{n^{n-1}}{(n-1)!} + \frac{n^n}{n!}\,\theta, \qquad \text{where} \quad \theta = \frac{1}{3} + \frac{4}{135(n+k)}, \]

and $k$ lies between 8/45 and 2/21. Ramanujan’s assertion indeed holds for all $n \ge 1$; see [237]
for a proof based on saddle-points and effective bounds. ◁
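
Ramanujan's assertion lends itself to a direct numerical experiment. The sketch below solves the displayed identity for $\theta$ and then for $k$, using the standard-library `decimal` module at 80 significant digits (a precision chosen here so that cancellation is harmless for small $n$). The function name `ramanujan_k` is illustrative.

```python
from decimal import Decimal, getcontext
import math

getcontext().prec = 80  # working precision, ample for n up to a few dozen

def ramanujan_k(n):
    """Solve (1/2)e^n = sum_{j<n} n^j/j! + (n^n/n!)*theta for theta, then for k."""
    partial = sum(Decimal(n ** j) / Decimal(math.factorial(j)) for j in range(n))
    last = Decimal(n ** n) / Decimal(math.factorial(n))
    theta = (Decimal(n).exp() / 2 - partial) / last
    # theta = 1/3 + 4/(135 (n + k))  =>  k = 4/(135 (theta - 1/3)) - n
    return 4 / (135 * (theta - Decimal(1) / 3)) - n

for n in (1, 2, 5, 10, 30):
    print(n, float(ramanujan_k(n)))
```

Each printed value of $k$ is expected to fall between $2/21$ and $8/45$.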


▷ VIII.39. Coalescence between a saddle-point and a singularity. The integral

\[ I_n := \frac{1}{2i\pi}\int_{0^+} \frac{(1+y)^{2n}}{(1-y)^{\alpha}}\,\frac{dy}{y^{n+1}} \]


°This is in essence an approach suggested by several sections of the original memoir of Darboux [137, 
§§3-5], in which “Darboux’s method” discussed in Chapter VI was first proposed. It is also of interest to 
note that a Lagrangean change of variables transforms a saddle-point circle into a contour whose geometry 
is of the type used in singularity analysis. 
Figure VIII.10. The coefficients $[z^N]e^{nz}$, normalized by $e^{-n}$, when $n = 100$ is
fixed and $N = 0\,..\,200$ varies, have a bell-shaped aspect.


can be treated directly, but this requires a suitable adaptation of the saddle-point method,
given the coalescence between a saddle-point at 1 [the part without the $(1-y)^{-\alpha}$ factor] and
a singularity at that same point. Alternatively, it can be subjected to the change of variables
$z = y/(1+y)^2$. Then $y$ is defined implicitly by $y = z(1+y)^2$, so that

\[ I_n = \frac{1}{2i\pi}\int_{0^+} \frac{1+y}{(1-y)^{1+\alpha}}\,\frac{dz}{z^{n+1}} = [z^n]\,\frac{1+y}{(1-y)^{1+\alpha}}. \]

Since $y(z)$ has a square-root singularity at $z = 1/4$, the integrand is of type $Z^{-(1+\alpha)/2}$, with $Z = 1 - 4z$, and

\[ I_n \sim \frac{4^n\, n^{(\alpha-1)/2}}{2^{\alpha}\,\Gamma\!\left(\frac{\alpha+1}{2}\right)}. \]


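In general, for $\phi(y)$ satisfying the assumptions (relative to $B$) of Theorem VIII.8, one
finds, with $\tau$ the root of $\phi(\tau) - \tau\phi'(\tau) = 0$,

\[ \frac{1}{2i\pi}\int_{0^+} \frac{\phi(y)^n}{(\phi(\tau) - \phi(y))^{\alpha}}\,\frac{dy}{y^{n+1}} \sim c\left(\frac{\phi(\tau)}{\tau}\right)^{n} \frac{n^{(\alpha-1)/2}}{\Gamma\!\left(\frac{\alpha+1}{2}\right)}. \]

Van der Waerden discusses this problem systematically in [589]. See also Section VIII. 10 below
for other coalescence situations. ◁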
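
For positive integer $\alpha$, the integral $I_n = [y^n]\,(1+y)^{2n}(1-y)^{-\alpha}$ of this note is a finite sum of binomial coefficients, which makes a numerical comparison possible. The sketch below checks it against the form $4^n n^{(\alpha-1)/2} / (2^{\alpha}\,\Gamma((\alpha+1)/2))$ that singularity analysis of $y(z)$ at $z = 1/4$ suggests; the restriction to integer $\alpha$ and the function names are assumptions made for this illustration.

```python
import math

def I_exact(n, alpha):
    """[y^n] (1+y)^(2n) / (1-y)^alpha for a positive integer alpha."""
    # [y^k] (1-y)^(-alpha) = C(k + alpha - 1, k)
    return sum(math.comb(2 * n, n - k) * math.comb(k + alpha - 1, k)
               for k in range(n + 1))

def I_asymptotic(n, alpha):
    """4^n n^((alpha-1)/2) / (2^alpha Gamma((alpha+1)/2))."""
    return 4.0 ** n * n ** ((alpha - 1) / 2) / (2 ** alpha * math.gamma((alpha + 1) / 2))

for alpha in (1, 2, 3):
    print(alpha, I_exact(400, alpha) / I_asymptotic(400, alpha))
```

The ratios tend to 1 only slowly, in keeping with a relative error of order $n^{-1/2}$.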


VIII. 8.3. Large powers: Gaussian forms. Saddle-point analysis has consequences
for multivariate asymptotics and it constitutes a direct way of establishing
that many discrete distributions tend to the Gaussian law in the asymptotic limit. For
large powers, this property derives painlessly from our earlier developments, especially
Theorem VIII.8, by means of a “perturbation” analysis.

First, let us examine a particularly easy problem: How do the coefficients
$[z^N]e^{nz}$ vary as a function of $N$ when $n$ is some large but fixed number? These coefficients are

\[ c_N^{(n)} = [z^N]\,e^{nz} = \frac{n^N}{N!}. \]

By the ratio test, they have a maximum when $N \approx n$ and are small when $N$ differs
significantly from $n$; see Figure VIII.10. The bell-shaped profile is also apparent on
the figure and is easily verified by elementary real analysis. The situation is then parallel
to what is already known of the binomial coefficients on the $n$th line of Pascal’s
triangle, corresponding to $[z^N](1 + z)^n$ with $N$ varying.
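
The bell-shaped profile can be observed directly. The following sketch evaluates the normalized coefficients $e^{-n} n^N / N!$ through logarithms (to avoid overflow) and compares them with a Gaussian density of mean $n$ and variance $n$, which is the shape one expects for $B(z) = e^z$. The function names are illustrative.

```python
import math

def normalized_coefficient(n, N):
    """e^{-n} [z^N] e^{nz} = e^{-n} n^N / N!, evaluated through logarithms."""
    return math.exp(-n + N * math.log(n) - math.lgamma(N + 1))

def gaussian(n, N):
    """Gaussian profile with mean n and variance n."""
    return math.exp(-(N - n) ** 2 / (2 * n)) / math.sqrt(2 * math.pi * n)

n = 100
peak = max(range(201), key=lambda N: normalized_coefficient(n, N))
for N in (80, 90, 100, 110, 120):
    print(N, normalized_coefficient(n, N), gaussian(n, N))
```

The ratio of consecutive coefficients is $n/N$, so the maximum sits at $N = n - 1$ and $N = n$ (a tie), which is what the ratio test predicts.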

The asymptotically Gaussian character of coefficients of large powers is actually
universal among a wide class of analytic functions. We prove this within the framework
of large powers already investigated in Subsection VIII. 8.1 and consider the
general problem of estimating the coefficients $[z^N]\,(A(z) \cdot B(z)^n)$ as $N$ varies. In accordance
with the conditions on p. 586, we postulate the following: (L1): $A(z)$, $B(z)$
are analytic at 0, have non-negative coefficients, and are such that $B(0) \neq 0$; (L2):
$B(z)$ is aperiodic; (L3): the radius of convergence $R$ of $B(z)$ is a minorant of the
radius of convergence of $A(z)$. We also recall that the spread has been defined as
$T := \lim_{x \to R^-} xB'(x)/B(x)$.

Theorem VIII.9 (Large powers and Gaussian forms). Consider the “large powers”
coefficients:

(85) \[ c_N^{(n)} := [z^N]\left(A(z) \cdot B(z)^n\right). \]

Assume that the two analytic functions $A(z)$, $B(z)$ satisfy the conditions (L1), (L2),
and (L3). Assume also that the radius of convergence of $B$ satisfies $R > 1$. Define the
two constants:

(86) \[ \mu = \frac{B'(1)}{B(1)}, \qquad \sigma^2 = \frac{B''(1)}{B(1)} + \frac{B'(1)}{B(1)} - \left(\frac{B'(1)}{B(1)}\right)^{2}. \]

Then the coefficients $c_N^{(n)}$ for fixed $n$ as $N$ varies admit a Gaussian approximation:
for $N = \mu n + x\sqrt{n}$, there holds (as $n \to \infty$)

(87) \[ \frac{1}{A(1)B(1)^n}\, c_N^{(n)} = \frac{1}{\sigma\sqrt{2\pi n}}\, e^{-x^2/(2\sigma^2)} \left(1 + O(n^{-1/2})\right), \]

uniformly with respect to $x$, when $x$ belongs to a finite interval of the real line.


Proof. We start with a few easy observations that shed light on the global behaviour
of the coefficients. First, since $R > 1$, we have the exact summation,

\[ \sum_{N=0}^{\infty} c_N^{(n)} = A(1)B(1)^n, \]

which explains the normalization factor in the estimate (87). Next, by definition of the
spread and since $R > 1$, one has

\[ \mu = \frac{B'(1)}{B(1)} < \lim_{x \to R^-} \frac{xB'(x)}{B(x)} = T, \]

given the general property that $xB'(x)/B(x)$ is increasing. Thus, the estimation of
the coefficients in the range $N = \mu n + O(\sqrt{n})$ falls into the orbit of Theorem VIII.8
which expresses the results of the saddle-point analysis in the case of large powers.
Referring to the statement of Theorem VIII.8, the saddle-point equation is

\[ \zeta\,\frac{B'(\zeta)}{B(\zeta)} = \frac{B'(1)}{B(1)} + \frac{x}{\sqrt{n}}, \]

with $\zeta$ a function of $x$ and $n$. For $x$ in a bounded set, we thus have $\zeta \sim 1$ as $n \to \infty$. It
then suffices to effect an asymptotic expansion of the quantities $\zeta$, $A(\zeta)$, $B(\zeta)$, $\xi$ in the
saddle-point formula of Equation (84). In other words, the fact that $N$ is close to $\mu n$
induces for $\zeta$ a small perturbation with respect to the value 1. With $b_j := B^{(j)}(1)$,
one finds mechanically

\[ \zeta = 1 + \frac{b_0^2}{b_0 b_2 + b_0 b_1 - b_1^2}\,\frac{x}{\sqrt{n}} + O(n^{-1}), \]
\[ B(\zeta) = b_0 + \frac{b_0^2\, b_1}{b_0 b_2 + b_0 b_1 - b_1^2}\,\frac{x}{\sqrt{n}} + O(n^{-1}), \]

and so on. The statement follows. □


Take first $A(z) \equiv 1$. In the particular case when $B(z)$ is the probability generating
function of a discrete random variable $Y$, one has $B(1) = 1$, and the coefficient
$\mu = B'(1)$ is the mean of the distribution. The function $B(z)^n$ is then the probability
generating function (PGF) of a sum of $n$ independent copies of $Y$. Theorem VIII.9 describes
a Gaussian approximation of the distribution of the sum near the mean. Such
an approximation is called a local limit law, where the epithet “local” refers to the fact
that the estimate applies to the coefficients themselves. (In contrast, an approximation
of the partial sums of the coefficients by the Gaussian error function is known as a
central limit law or, sometimes, as an integral limit law.) In the more general case
in which $A(z)$ is also the PGF of a non-degenerate random variable (i.e., $A(z) \neq 1$),
similar properties hold and one has:


Corollary VIII.3 (Local limit law for sums). Let $X$ be a random variable with probability
generating function (PGF) $A(z)$ and $Y_1, \ldots, Y_n$ be independent variables with
PGF $B(z)$, where it is assumed that $X$ and the $Y_j$ are supported on $\mathbb{Z}_{\ge 0}$. Assume that
$A(z)$ and $B(z)$ are analytic in some disc that contains the unit disc in its interior and
that $B(z)$ is aperiodic. Let the coefficients $\mu$, $\sigma$ be as in (86). Then the sum,

\[ S_n := X + Y_1 + Y_2 + \cdots + Y_n, \]

satisfies a local limit law of the Gaussian type: for $t$ in any finite interval, one has

\[ P\left(S_n = \lfloor \mu n + t\sigma\sqrt{n} \rfloor\right) = \frac{e^{-t^2/2}}{\sigma\sqrt{2\pi n}}\left(1 + O(n^{-1/2})\right). \]

Proof. This is just a restatement of Theorem VIII.9, setting $x = t\sigma$ and taking into
account $A(1) = B(1) = 1$. □
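
The local limit law can be tested on a small example. In the sketch below, $B(z)$ is the PGF of a variable uniform on $\{0, 1, \ldots, 5\}$ (an illustrative choice, aperiodic since consecutive values are in the support) and $A(z) = 1$; the coefficients of $B(z)^n$ are obtained by repeated convolution, and the constants $\mu$, $\sigma^2$ are computed from the derivative formula of (86). All names are hypothetical.

```python
import math

def poly_power(coeffs, n):
    """Coefficients of B(z)^n for a polynomial B given by its coefficient list."""
    result = [1.0]
    for _ in range(n):
        new = [0.0] * (len(result) + len(coeffs) - 1)
        for i, r in enumerate(result):
            for j, c in enumerate(coeffs):
                new[i + j] += r * c
        result = new
    return result

def mean_and_variance(coeffs):
    """mu and sigma^2 of (86) for a polynomial B, from B(1), B'(1), B''(1)."""
    b0 = sum(coeffs)
    b1 = sum(j * c for j, c in enumerate(coeffs))
    b2 = sum(j * (j - 1) * c for j, c in enumerate(coeffs))
    mu = b1 / b0
    return mu, b2 / b0 + mu - mu ** 2

die = [1 / 6] * 6          # uniform on {0, 1, ..., 5}; an illustrative choice
n = 200
mu, var = mean_and_variance(die)
dist = poly_power(die, n)
N = round(mu * n)          # centre of the distribution (t = 0)
local = dist[N]
gauss = 1 / math.sqrt(2 * math.pi * n * var)
print(mu, var, local, gauss)
```

Repeated convolution costs a number of operations quadratic in $n$, which is ample for a check of this size; a fast Fourier transform would be the natural replacement for larger experiments.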


Gaussian forms for large powers admit many variants. As already pointed out
in Section VII. 4, the positivity conditions can be greatly relaxed. Furthermore, estimates
for partial sums of the coefficients are possible by similar techniques. The
asymptotic expansions can be extended to any order. Finally, suitable adaptations of
Theorems VIII.8 and VIII.9 make it possible to allow $x$ to tend slowly to infinity and
manage what is known as a “moderate deviation” regime. We do not pursue these aspects
here since we shall develop a more general framework, that of “Quasi-powers”
in the next chapter.

▷ VIII.40. An alternative proof of Corollary VIII.3. The saddle-point $\zeta$ is near 1 when $N$ is near
the centre $N \approx \mu n$. It is alternatively possible to recover the $c_N^{(n)}$ by Cauchy’s formula upon
integrating along the circle $|z| = 1$, which is then only an approximate saddle-point contour.
This convenient variant is often used in the literature, but one needs to take care of linear terms
in expansions. Its origins go back to Laplace himself in his first proof of the local limit theorem
(which was expressed however in the language of Fourier series as Cauchy’s theory was yet
to be born). See Laplace’s treatise Théorie Analytique des Probabilités [402] first published in
1812 for much fascinating mathematics related to this problem. ◁


VIII. 9. Saddle-points and probability distributions 


Saddle-point methods are useful not only for estimating combinatorial counts, but 
also for extracting probabilistic characteristics of large combinatorial structures. In the 
previous section, we have already encountered the large powers framework, giving rise 
to Gaussian laws. In this section, we further examine the way a saddle-point analysis 
can serve to quantify properties of random structures. 


VIII. 9.1. Moment analyses. Univariate applications of admissibility include 
the analysis of generating functions relative to moments of distributions, which are 
obtained by differentiation and specialization of corresponding multivariate generat- 
ing functions. In the context of saddle-point analyses, the dominant asymptotic form 
of the mean value as well as bounds on the variance usually result, often leading to 
concentration of distribution (convergence in probability) properties. In what follows, 
we focus on the analysis of first moments (see also Subsection VII. 10.1, p. 532, for 
the “moment pumping” method developed in the context of singularity analysis). 

The situation of interest here is that of a counting generating function $G(z)$, corresponding
to a class $\mathcal{G}$, which is amenable to the saddle-point method. A parameter $\chi$
on $\mathcal{G}$ gives rise to a bivariate GF $G(z, u)$, which is a deformation of $G(z)$ when $u$ is
close to 1. Then the GFs

\[ \partial_u G(z, u)\big|_{u=1}, \qquad \partial_u^2 G(z, u)\big|_{u=1}, \qquad \ldots, \]

relative to successive (factorial) moments, are in many cases amenable to an analysis
that closely resembles that of $G(z)$ itself. In this way, moments can be estimated
asymptotically.

We illustrate the analysis of moments by two examples: (i) Example VIII.12 pro- 
vides an analysis of the mean number of blocks in a random set partition by bivariate 
generating functions; (ii) Example VIII.13 estimates the mean number of increasing 
subsequences in a random permutation by a direct generating function construction. 
The first example foreshadows the full treatment of the corresponding limit distribu- 
tion in the next chapter (Subsection IX. 8, p. 690). 


Example VIII.12. Blocks in random set partitions. The function

\[ G(z, u) = e^{u(e^z - 1)} \]

is the bivariate generating function of set partitions, with $u$ marking the number of blocks (or
parts). We set $G(z) = G(z, 1)$ and define

\[ M(z) = \frac{\partial}{\partial u} G(z, u)\Big|_{u=1} = (e^z - 1)\,e^{e^z - 1} = e^{e^z + z - 1} - G(z). \]

Thus, the quantity

\[ \frac{m_n}{g_n} = \frac{[z^n]M(z)}{[z^n]G(z)} \]

represents the mean number of parts in a random partition of $[1\,..\,n]$. Since subtracting $G(z)$
only shifts this mean by $-1$, which is negligible for the first-order asymptotics sought, $M(z)$ may be
replaced by $e^{e^z + z - 1}$ in the estimates that follow. We already know that $G(z)$
is admissible and so is $M(z)$ by closure properties. The saddle-point for the coefficient integral
of $G(z)$ occurs at $\zeta$ such that $\zeta e^{\zeta} = n$, and it is already known that $\zeta = \log n - \log\log n + o(1)$.

It would be possible to analyse $M(z)$ by means of Theorem VIII.4 directly: the analysis
then involves a saddle-point $\zeta_1 \neq \zeta$ that is relative to $M(z)$; an estimation of the mean then
follows, albeit at the expense of some computational effort. It is however more transparent to
appeal to Proposition VIII.5, p. 567, and analyse the coefficients of $M(z)$ at the saddle-point of
$G(z)$.

Let $a(r)$, $b(r)$ and $\bar{a}(r)$, $\bar{b}(r)$ be the functions $a_1(r)$, $a_2(r)$ of Equation (47), relative to
$G(z)$ and $M(z)$, respectively:

\[ \begin{array}{ll}
\log G(z) = e^z - 1 & \log M(z) = e^z + z - 1 \\
a(r) = re^r & \bar{a}(r) = re^r + r = a(r) + r \\
b(r) = (r^2 + r)e^r & \bar{b}(r) = (r^2 + r)e^r + r = b(r) + r.
\end{array} \]


Thus, estimating $m_n$ by Proposition VIII.5 with the formula taken at $r = \zeta$, one finds

\[ m_n = \frac{e^{\zeta} G(\zeta)}{\zeta^n \sqrt{2\pi \bar{b}(\zeta)}} \left[\exp\left(-\frac{\zeta^2}{2\bar{b}(\zeta)}\right) + o(1)\right], \]

while the corresponding estimate for $g_n$ is

\[ g_n = \frac{G(\zeta)}{\zeta^n \sqrt{2\pi b(\zeta)}}\,(1 + o(1)). \]

Given that $\bar{b}(\zeta) \sim b(\zeta)$ and that $\zeta^2$ is of smaller order than $\bar{b}(\zeta)$, one has

\[ \frac{m_n}{g_n} = e^{\zeta}(1 + o(1)) = \frac{n}{\log n}(1 + o(1)). \]

A similar computation applies to the second moment of the number of parts which is
found to be asymptotic to $e^{2\zeta}$ (the computation involves taking a second derivative). Thus, the
standard deviation of the number of parts is of an order $o(e^{\zeta})$ that is smaller than the mean.
This implies a concentration property for the distribution of the number of parts.
Proposition VIII.8. The variable $X_n$ equal to the number of parts in a random partition of the
set $[1\,..\,n]$ has expectation

\[ E\{X_n\} = \frac{n}{\log n}(1 + o(1)). \]

The distribution satisfies a “concentration” property: for any $\epsilon > 0$, one has

\[ P\left\{ \left| \frac{X_n}{E\{X_n\}} - 1 \right| > \epsilon \right\} \to 0 \quad \text{as } n \to +\infty. \]

The calculations are not especially difficult (see Note VIII.41 for the end result) but they require
care in the manipulation of asymptotic expansions: for instance, Salvy and Shackell [530]
who “do it right” report that two discrepant estimates (differing by a factor of $e^{-1}$) had been
previously published regarding the value of the mean. □
▷ VIII.41. Moments of the number of blocks in set partitions. Let $X_n$ be the number of blocks
in a random partition of $n$ elements. Then, one has

\[ E(X_n) = \frac{n}{\log n} + \frac{n \log\log n\,(1 + o(1))}{\log^2 n}, \qquad V(X_n) = \frac{n}{\log^2 n} + \frac{n\,(2\log\log n - 1 + o(1))}{\log^3 n}, \]

which proves concentration. The calculation is best performed in terms of the saddle-point $\zeta$,
then converted in terms of $n$. (See Salvy’s étude [529] and the paper [530].) ◁
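
The mean number of blocks can also be computed exactly, which gives a feel for how slowly the asymptotic regime sets in. The sketch below relies on the classical identity that the total number of blocks over all partitions of an $n$-set is $B_{n+1} - B_n$, with $B_n$ the Bell numbers, so that the mean equals $B_{n+1}/B_n - 1$; it then compares this value with $e^{\zeta}$, where $\zeta e^{\zeta} = n$. The function names are illustrative.

```python
import math

def bell_numbers(limit):
    """Bell numbers B_0..B_limit via the Bell triangle."""
    bells = [1]
    row = [1]
    for _ in range(limit):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
        bells.append(row[0])
    return bells

def saddle_point(n):
    """Positive root of zeta * exp(zeta) = n, by Newton iteration."""
    zeta = math.log(n + 1.0)
    for _ in range(50):
        zeta -= (zeta * math.exp(zeta) - n) / ((zeta + 1) * math.exp(zeta))
    return zeta

def mean_blocks(n, bells):
    """Exact mean number of blocks: B_{n+1}/B_n - 1."""
    return bells[n + 1] / bells[n] - 1

bells = bell_numbers(201)
for n in (10, 50, 200):
    zeta = saddle_point(n)
    print(n, mean_blocks(n, bells), math.exp(zeta), n / math.log(n))
```

Expressing the estimate through $\zeta$ rather than through $n/\log n$ is the more accurate option at moderate sizes, in line with the advice of the note.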


▷ VIII.42. The shape of random involutions. Consider a random involution of size $n$, the EGF
of involutions being $e^{z + z^2/2}$. Then the mean numbers of 1-cycles and 2-cycles satisfy

\[ E(\#\,\text{1-cycles}) = \sqrt{n} + O(1), \qquad E(\#\,\text{2-cycles}) = \frac{1}{2}n - \frac{1}{2}\sqrt{n} + O(1). \]

In addition, the corresponding distributions are concentrated. ◁


Example VIII.13. Increasing subsequences in permutations. Given a permutation written in
linear notation as $\sigma = \sigma_1 \cdots \sigma_n$, an increasing subsequence is a subsequence $\sigma_{i_1} \cdots \sigma_{i_k}$ which
is in increasing order, i.e., $i_1 < \cdots < i_k$ and $\sigma_{i_1} < \cdots < \sigma_{i_k}$. The question is: What is the mean
number of increasing subsequences in a random permutation?

The problem has a flavour analogous to that of “hidden” patterns in random words, which
was tackled in Chapter V, p. 315, and indeed similar methods are applicable here. Define a
tagged permutation as a permutation together with one of its increasing subsequences distinguished.
(We also consider the null subsequence as an increasing subsequence.) For instance,

7 | 3 5 2 | 6 4 1 | 8 9

is a tagged permutation with the increasing subsequence 3 6 8 that is distinguished. The vertical
bars are used to identify the tagged elements, but they may also be interpreted as decomposing
the permutation into sub-permutation fragments. We let $\mathcal{T}$ be the class of tagged permutations,
with $T(z)$ the corresponding EGF, and set $T_n = n!\,[z^n]T(z)$. The mean number of increasing
subsequences in a random permutation of size $n$ is clearly $t_n = T_n/n!$.

In order to enumerate $\mathcal{T}$, we let $\mathcal{P}$ be the class of all permutations and $\mathcal{P}^+$ the subclass of
non-empty permutations. Then, one has, up to isomorphism,

\[ \mathcal{T} = \mathcal{P} \star \mathrm{SET}(\mathcal{P}^+), \]

since a tagged permutation can be reconstructed from its initial fragment and the set of its
fragments (by ordering the set according to increasing values of initial elements). This combinatorial
argument gives the EGF $T(z)$ as

\[ T(z) = \frac{1}{1-z}\exp\left(\frac{z}{1-z}\right). \]


The generating function $T(z)$ can be expanded, so that the quantity $T_n$ admits a closed
form,

\[ T_n = \sum_{k=0}^{n} \binom{n}{k} \frac{n!}{k!}. \]

From this, it is possible to analyse $T_n$ asymptotically by means of the Laplace method for sums,
as was done by Lifschitz and Pittel in [407]. However, analytically, the function $T(z)$ is a
mere variant of the EGF of fragmented permutations. Saddle-point conditions are again easily
checked, either directly or via admissibility, to the effect that

(88) \[ t_n \equiv \frac{T_n}{n!} \sim \frac{e^{-1/2}\, e^{2\sqrt{n}}}{2\sqrt{\pi}\, n^{1/4}}. \]

(Compare with the closely related estimate (45) on p. 562.)
The estimate (88) has the great advantage of providing information about an important and
much less accessible parameter. Indeed, let $\lambda(\sigma)$ represent the length of the longest increasing
subsequence in $\sigma$. With $I(\sigma)$ the number of increasing subsequences, one has the general
inequality,

\[ 2^{\lambda(\sigma)} \le I(\sigma), \]

since the number of increasing subsequences of $\sigma$ is at least as large as the number of subsequences
contained in the longest increasing subsequence. Let now $\ell_n$ be the expectation of $\lambda$
over permutations of size $n$. Then, convexity of the function $2^x$ implies

(89) \[ 2^{\ell_n} \le t_n, \qquad \text{so that} \quad \ell_n \le \frac{2}{\log 2}\sqrt{n}\,(1 + o(1)). \]

In summary:
In summary: 


Proposition VIII.9. The mean number of increasing subsequences in a random permutation
of $n$ elements is asymptotically

\[ \frac{e^{-1/2}\, e^{2\sqrt{n}}}{2\sqrt{\pi}\, n^{1/4}}\,(1 + o(1)). \]

Accordingly, the expected length of the longest increasing subsequence in a random permutation
of size $n$ satisfies the inequality

\[ \ell_n \le \frac{2}{\log 2}\sqrt{n}\,(1 + o(1)) \approx 2.89\sqrt{n}. \]


Note VIII.45 describes an elementary lower bound of the form $\ell_n \ge \frac{1}{2}\sqrt{n}$. In fact, around
1977, Logan and Shepp [411] and, independently, Vershik and Kerov [596] succeeded in establishing
the much more difficult result

\[ \ell_n \sim 2\sqrt{n}. \]

Their proof is based on a detailed analysis of the profile of a random Young tableau. (The bound
obtained here by a simple mixture of saddle-point estimates and combinatorial approximations
at least provides the right order of magnitude.) This has led in turn to attempts at characterizing
the asymptotic distribution of the length of the longest increasing subsequence. The problem
remained unsolved for two decades, despite many tangible steps forward. J. Baik, P. A. Deift,
and K. Johansson [24] eventually obtained a solution, in 1999, by relating longest increasing
subsequences to eigenvalues of random matrix ensembles (see Note VIII.46 for the end result).
We regretfully redirect the reader to relevant presentations of the beautiful theory surrounding
this sensational result, for instance [10, 148]. □
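
The bounds on $\ell_n$ discussed in this example can be explored by simulation. The sketch below computes the length of a longest increasing subsequence with the standard patience-sorting technique (a well-known $O(n \log n)$ algorithm, not the method of the text) and averages it over random permutations; the sample sizes and the seed are arbitrary choices, and the function names are illustrative.

```python
import bisect
import random

def lis_length(perm):
    """Length of the longest increasing subsequence, by patience sorting."""
    tops = []  # tops[i] = smallest last element of an increasing subsequence of length i+1
    for x in perm:
        i = bisect.bisect_left(tops, x)
        if i == len(tops):
            tops.append(x)
        else:
            tops[i] = x
    return len(tops)

def mean_lis(n, trials, seed=0):
    """Empirical mean of the longest increasing subsequence length."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        total += lis_length(perm)
    return total / trials

n = 400
print(mean_lis(n, 50), 0.5 * n ** 0.5, 2 * n ** 0.5, 2.89 * n ** 0.5)
```

On the tagged permutation 7 3 5 2 6 4 1 8 9 used earlier, `lis_length` returns 5, corresponding for instance to the subsequence 3 5 6 8 9.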


▷ VIII.43. A useful recurrence. A decomposition according to the location of $n$ yields for $t_n$
the recurrence

\[ t_n = t_{n-1} + \frac{1}{n}\sum_{k=0}^{n-1} t_k, \qquad t_0 = 1. \]

Hence $T(z)$ satisfies the ordinary differential equation,

\[ (1 - z)^2\,\frac{d}{dz}T(z) = (2 - z)\,T(z), \qquad T(0) = 1, \]

which gives rise to the simpler recurrence

\[ t_{n+1} = 2t_n - \frac{n}{n+1}\,t_{n-1}, \qquad t_0 = 1, \quad t_1 = 2, \]

by which $t_n$ can be computed efficiently in a linear number of operations. ◁
▷ VIII.44. Related combinatorics. The sequence $T_n = n!\,t_n$ starts as 1, 2, 7, 34, 209, 1546, and
is EIS A002720. The number $T_n$ counts the following equivalent objects: (i) the $n \times n$ binary
matrices with at most one entry 1 in each row and each column; (ii) the partial matchings of the complete
bipartite graph $K_{n,n}$; (iii) the injective partial mappings of $[1\,..\,n]$ to itself. ◁
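
The closed form, the two recurrences and the saddle-point estimate for $t_n$ can be cross-checked with exact rational arithmetic. The sketch below assumes the closed form $t_n = \sum_k \binom{n}{k}/k!$ obtained by expanding $T(z)$, and the initial values $t_0 = 1$, $t_1 = 2$; the function names are illustrative.

```python
from fractions import Fraction
import math

def t_closed_form(n):
    """t_n = sum_{k=0}^{n} C(n, k) / k!."""
    return sum(Fraction(math.comb(n, k), math.factorial(k)) for k in range(n + 1))

def t_first_recurrence(limit):
    """t_n = t_{n-1} + (1/n) * sum_{k<n} t_k, with t_0 = 1."""
    t = [Fraction(1)]
    running_sum = Fraction(1)
    for n in range(1, limit + 1):
        t.append(t[-1] + running_sum / n)
        running_sum += t[-1]
    return t

def t_second_recurrence(limit):
    """t_{n+1} = 2 t_n - n/(n+1) t_{n-1}, with t_0 = 1, t_1 = 2."""
    t = [Fraction(1), Fraction(2)]
    for n in range(1, limit):
        t.append(2 * t[n] - Fraction(n, n + 1) * t[n - 1])
    return t[: limit + 1]

def t_asymptotic(n):
    """e^{-1/2} e^{2 sqrt(n)} / (2 sqrt(pi) n^{1/4})."""
    return math.exp(-0.5 + 2 * math.sqrt(n)) / (2 * math.sqrt(math.pi) * n ** 0.25)

print([int(math.factorial(n) * t_closed_form(n)) for n in range(6)])
print(float(t_closed_form(400)) / t_asymptotic(400))
```

Exact fractions are used so that agreement between the three computations is an equality test rather than a floating-point comparison.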


▷ VIII.45. A simple probabilistic lower bound. Elementary probability theory provides a
simple lower bound on $\ell_n$. Let $X_1, \ldots, X_n$ be independent random variables uniformly distributed
over $[0, 1]$. Assume $n = m^2$. Partition $[0, 1[$ into $m$ subintervals each of the form
$[(j-1)/m,\ j/m[$ and $X_1, \ldots, X_n$ into $m$ blocks, each of the form $X_{(k-1)m+1}, \ldots, X_{km}$. There
is a probability $1 - (1 - m^{-1})^m \sim 1 - e^{-1}$ that block numbered 1 contains an element of
subinterval numbered 1, block numbered 2 contains an element of subinterval numbered 2,
and so on. Then, with high probability, at least $m/2$ of the blocks contain an element in their
matching subinterval. Consequently, $\ell_n \ge \frac{1}{2}\sqrt{n}$, for $n$ large enough. (The factor 1/2 can
even be improved a little.) The crisp booklet by Steele [556] describes many similar as well as
more advanced applications to combinatorial optimization. See also the book of Motwani and
Raghavan [451] for applications to randomized algorithms in computer science. ◁


▷ VIII.46. The Baik-Deift-Johansson Theorem. Consider the Painlevé II equation

\[ u''(x) = 2u(x)^3 + x\,u(x) \]

and the particular solution $u_0(x)$ that is asymptotic to $-\mathrm{Ai}(x)$ as $x \to +\infty$, with $\mathrm{Ai}(x)$ the
Airy function, which solves $y'' - xy = 0$. Define the Tracy-Widom distribution (arising in
random matrix theory)

\[ F(t) = \exp\left(-\int_t^{\infty} (x - t)\,u_0(x)^2\,dx\right). \]

The distribution of the length of the longest increasing subsequence, $\lambda_n$, satisfies

\[ \lim_{n \to \infty} P\left(\lambda_n \le 2\sqrt{n} + t\,n^{1/6}\right) = F(t) \]

for any fixed $t$. Thus the discrete random variable $\lambda_n$ converges to a well-characterized distribution
[24]. (An exact formula for associated GFs is due to Gessel; see p. 753.) ◁


VIII. 9.2. Families of generating functions. There is an extreme diversity of 
possible situations, which partly defy classification, when analysing a family of gener- 
ating functions associated with an extremal parameter. Accordingly, we must content 
ourselves with the discussion of a single representative example relative to random 
allocations. (A good rule of thumb is once more that the saddle-point method is likely 
to succeed in cases involving some sort of exponential growth of GFs.) Problems of a 
true multivariate nature will be examined in the next chapter specifically dedicated to 
multivariate asymptotics and limit distributions. 


Random allocations. The example that follows is relative to random allocations,
occupancy statistics, and balls-in-bins models, as introduced in Subsection II. 3.2, p. 111.


Example VIII.14. Capacity in occupancy problems. Assume that $n$ balls are thrown into $m$
bins, uniformly at random. How many balls does the most filled bin contain? We shall examine
the regime $n = \alpha m$ for some fixed $\alpha$ in $(0, +\infty)$; see Example III.10 (p. 177) for a first analysis
and relations to the Poisson law. The size of the most filled bin is called the capacity and we let
$C_{n,m}$ denote the random variable, when all $m^n$ allocations are taken equally likely. Under our
conditions a random bin contains on average a constant number, $\alpha$, of balls. The proposition
below proves that the most filled bin has somewhat more, as illustrated by Figure VIII.11. (We
limit ourselves here to saddle-point bounds. The various regimes of the distribution are well
covered in [388, pp. 94-115].)

Figure VIII.11. Three random allocations of $n = 100$ balls in $m = 100$ bins.


Proposition VIII.10. Let $n$ and $m$ tend simultaneously to infinity, with the constraint that
$n/m = \alpha$ for some constant $\alpha > 0$. Then, the expected capacity satisfies

\[ \frac{1}{2}\,\frac{\log n}{\log\log n}\,(1 + o(1)) \le E\{C_{n,m}\} \le 2\,\frac{\log n}{\log\log n}\,(1 + o(1)). \]

In addition, the probability of capacity to lie outside the interval determined by the lower and
upper bounds tends to 0 as $m, n \to \infty$.


Proof. We detail the proof when $\alpha = 1$ and abbreviate $C_n = C_{n,m}$, the generalization to $\alpha \neq 1$
requiring only simple adjustments. From Chapter II, we know that

(90) \[ P\{C_n \le b\} = \frac{n!}{n^n}\,[z^n]\,(e_b(z))^n, \qquad P\{C_n > b\} = \frac{n!}{n^n}\,[z^n]\left(e^{nz} - (e_b(z))^n\right), \]

where $e_b(z)$ is the truncated exponential:

\[ e_b(z) = \sum_{j=0}^{b} \frac{z^j}{j!}. \]


The two equalities of (90) permit us to bound the left and right tails of the distribution. As
suggested by the Poisson approximation of the balls-in-bins model, we decide to adopt saddle-point
bounds based on $z = 1$. This gives (cf. Theorem VIII.2, p. 547):

(91) \[ P\{C_n \le b\} \le \frac{n!\,e^n}{n^n}\left(\frac{e_b(1)}{e}\right)^{n}, \qquad P\{C_n > b\} \le \frac{n!\,e^n}{n^n}\left(1 - \left(\frac{e_b(1)}{e}\right)^{n}\right). \]

We set

(92) \[ \rho_b(n) = \left(\frac{e_b(1)}{e}\right)^{n}. \]

This quantity represents the probability that $n$ Poisson variables of rate 1 all have value $b$ or less.
(We know from elementary probability theory that this should be a reasonable approximation of
the problem at hand.) A weak form of Stirling’s formula, namely, $n!\,e^n/n^n < 2\sqrt{\pi n}$, for $n \ge 1$,
then yields an alternative version of (91),

(93) \[ P\{C_n \le b\} \le 2\sqrt{\pi n}\,\rho_b(n), \qquad P\{C_n > b\} \le 2\sqrt{\pi n}\,(1 - \rho_b(n)). \]

For fixed $n$, the function $\rho_b(n)$ increases steadily from $e^{-n}$ to 1 as $b$ varies from 0 to $\infty$.
In particular, the “transition region” where $\rho_b(n)$ stays away from both 0 and 1 is expected to
play a rôle. This suggests defining $b_0 \equiv b_0(n)$ such that

\[ b_0! \le n < (b_0 + 1)!, \]

so that

\[ b_0(n) = \frac{\log n}{\log\log n}\,(1 + o(1)). \]

We also observe that, as $n, b \to \infty$, there holds

(94) \[ \rho_b(n) = \left(e^{-1} e_b(1)\right)^{n} = \left(1 - \frac{e^{-1}}{(b+1)!} + O\!\left(\frac{1}{(b+2)!}\right)\right)^{n} = \exp\left(-\frac{n e^{-1}}{(b+1)!} + O\!\left(\frac{n}{(b+2)!}\right)\right). \]

Left tail. We take $b = \lfloor \frac{1}{2}b_0 \rfloor$ and a simple computation from (94) shows that for $n$ large
enough, $\rho_b(n) \le \exp(-\sqrt[3]{n})$. Thus, by the first inequality of (93), the probability that the
capacity be less than $\frac{1}{2}b_0$ is exponentially small:

(95) \[ P\{C_n \le \tfrac{1}{2}b_0(n)\} \le 2\sqrt{\pi n}\,\exp(-\sqrt[3]{n}). \]
Right tail. Take $b = 2b_0$. Then, again from (94), for $n$ large enough, one has $1 - \rho_b(n) \le
1 - \exp(-\frac{1}{n}) = \frac{1}{n}(1 + o(1))$. Thus, the probability of observing a capacity that exceeds $2b_0$ is
vanishingly small, and is $O(n^{-1/2})$. Taking next $b = 2b_0 + r$ with $r > 0$, similarly gives the
bound

(96) \[ P\{C_n > 2b_0(n) + r\} \le 2\sqrt{\frac{\pi}{n}}\left(\frac{1}{b_0(n)}\right)^{r}. \]

The analysis of the left and right tails in Equations (95) and (96) now implies

\[ E\{C_n\} \le 2b_0(n) + \sum_{r=0}^{\infty} 2\sqrt{\frac{\pi}{n}}\,(b_0(n))^{-r} = 2b_0(n)(1 + o(1)), \]
\[ E\{C_n\} \ge \sum_{r=0}^{\lfloor \frac{1}{2}b_0(n) \rfloor} \left[1 - 2\sqrt{\pi n}\,\exp(-\sqrt[3]{n})\right] = \frac{1}{2}b_0(n)(1 + o(1)). \]

This justifies the claim of the proposition when $\alpha = 1$. The general case ($\alpha \neq 1$) follows
similarly from saddle-point bounds taken at $z = \alpha$. □


The saddle-point bounds described above are obviously not tight: with some care in derivations,
one can show by the same means that the distribution is tightly concentrated around its
mean, itself asymptotic to $\log n/\log\log n$. In addition, the saddle-point method may be used
instead of crude bounds. These results, in the context of longest probe sequences in hashing,
were obtained by Gonnet [301] under the Poisson model. Many key estimates regarding random
allocations (including capacity) are to be found in the book by Kolchin et al. [388]. Analyses
of this type are also useful in evaluating various dynamic hashing algorithms by means of
saddle-point methods [217, 504]. □
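
The quantities entering the proof are easy to compute, and the capacity itself is easy to simulate. The sketch below evaluates $b_0(n)$ from its defining inequality $b_0! \le n < (b_0+1)!$, the Poisson-type quantity $\rho_b(n) = (e_b(1)/e)^n$, and the maximum load observed in one random allocation with $n = m$; the seed and the sizes are arbitrary, and all function names are illustrative.

```python
import math
import random

def b0(n):
    """Largest b with b! <= n, so that b0! <= n < (b0+1)!."""
    b = 0
    while math.factorial(b + 1) <= n:
        b += 1
    return b

def rho(b, n):
    """(e_b(1)/e)^n: probability that n Poisson(1) variables are all <= b."""
    e_b = sum(1.0 / math.factorial(j) for j in range(b + 1))
    return (e_b / math.e) ** n

def capacity(n, m, rng):
    """Maximum bin load when n balls are thrown uniformly into m bins."""
    loads = [0] * m
    for _ in range(n):
        loads[rng.randrange(m)] += 1
    return max(loads)

n = 1000
rng = random.Random(1)
print(b0(n), [round(rho(b, n), 4) for b in range(2, 9)], capacity(n, n, rng))
```

The list of values of $\rho_b(n)$ makes the transition region visible: the values move from near 0 to near 1 over a few consecutive values of $b$ around $b_0(n)$.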


VIII. 10. Multiple saddle-points 


We conclude this chapter with a discussion of higher order saddle-points, accom- 
panied by brief indications on what are known as phase transitions or critical phenom- 
ena in the applied sciences. 
Multiple saddle-point formula. All the analyses carried out so far have been in
terms of simple saddle-points, which represent by far the most common situation. In
order to get a feel of what happens in the case of multiple saddle-points, consider first
the problem of estimating the two real integrals,

\[ I_n := \int_0^1 (1 - x^2)^n\,dx, \qquad J_n := \int_0^1 (1 - x^3)^n\,dx. \]

(These examples are illustrative: as a check of the results, note that the integrals can
be evaluated in closed form by way of the Beta function, Note B.10, p. 747.) The contribution
of any interval $[x_0, 1]$ is exponentially small, and the ranges to be considered
on the right of 0 are about $n^{-1/2}$ and $n^{-1/3}$, respectively. One thus sets

\[ x = \frac{t}{\sqrt{n}} \ \text{ for } I_n, \qquad x = \frac{t}{\sqrt[3]{n}} \ \text{ for } J_n. \]

Following the guidelines of the method of Laplace (Appendix B.6, p. 755), we proceed
as follows: local expansions are applied, then tails are completed in the usual way, to
the effect that

\[ I_n \sim \frac{1}{\sqrt{n}}\int_0^{\infty} e^{-t^2}\,dt, \qquad J_n \sim \frac{1}{\sqrt[3]{n}}\int_0^{\infty} e^{-t^3}\,dt. \]

The last integrals reduce to the Gamma function integral, which provides

\[ I_n \sim \frac{1}{2}\,\frac{\Gamma(\frac{1}{2})}{n^{1/2}}, \qquad J_n \sim \frac{1}{3}\,\frac{\Gamma(\frac{1}{3})}{n^{1/3}}. \]

The repeated occurrences of $\frac{1}{2}$ in the quadratic case and of $\frac{1}{3}$ in the cubic case stand
out. The situation in the cubic case corresponds to the Laplace method for integrals,
when a multiple critical point is present (Note B.23, p. 759).
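
The check by way of the Beta function mentioned above can be carried out numerically. The sketch below uses the substitution $u = x^p$, which gives $\int_0^1 (1 - x^p)^n\,dx = \frac{1}{p}\,\Gamma(1/p)\,\Gamma(n+1)/\Gamma(n+1+1/p)$, and compares this exact value with the Laplace estimate $\frac{1}{p}\,\Gamma(1/p)\,n^{-1/p}$ for $p = 2$ and $p = 3$. The function names are illustrative.

```python
import math

def laplace_integral_exact(n, p):
    """int_0^1 (1 - x^p)^n dx = (1/p) * Beta(1/p, n+1), via log-Gamma."""
    log_beta = math.lgamma(1 / p) + math.lgamma(n + 1) - math.lgamma(n + 1 + 1 / p)
    return math.exp(log_beta) / p

def laplace_integral_asymptotic(n, p):
    """(1/p) * Gamma(1/p) * n^(-1/p)."""
    return math.gamma(1 / p) / (p * n ** (1 / p))

for n in (10, 100, 1000):
    ratio_I = laplace_integral_exact(n, 2) / laplace_integral_asymptotic(n, 2)
    ratio_J = laplace_integral_exact(n, 3) / laplace_integral_asymptotic(n, 3)
    print(n, ratio_I, ratio_J)
```

Both ratios tend to 1, the exponents $1/2$ and $1/3$ of $n$ being exactly the ones singled out in the text.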

What has been just encountered in the case of real integrals is typical of what
to expect for complex integrals and saddle-points of higher orders, as we now explain.
First, we briefly revisit the discussion of landscapes of analytic functions at the
beginning of Section VIII. 1, p. 543. Consider, for simplicity, the case of a double
saddle-point of an analytic function $F(z)$. At such a point $\zeta$, we have $F(\zeta) \neq 0$,
$F'(\zeta) = F''(\zeta) = 0$, and $F'''(\zeta) \neq 0$. Then, there are three steepest descent lines
emanating from the saddle-point and three steepest ascent lines. Accordingly, one
should think of the landscape of $|F(z)|$ as formed of three “valleys” separated by
three mountains and meeting at the common point $\zeta$. The characteristic aspect is that
of a “monkey saddle” (comparable to a saddle with places for two legs and a tail) and
is displayed in Figure VIII.12.

In order to avoid an unpleasant discussion of the combinatorics of valleys, we
now discuss the case of a multiple saddle-point estimation of an integral $\int_A^B F(z)\,dz$ in the case
where the starting point $A$ coincides with the saddle-point $\zeta$. By a painless surgery of
paths, this entails no loss of generality. We can then enunciate a modified form of the
saddle-point formula of Theorem VIII.3.


Figure VIII.12. A double saddle-point or “monkey saddle”. Left: the surface
$|\exp(z^3)|$ around the double saddle-point $z = 0$; right: level curves with arrows
pointing towards directions of increase. (Inward pointing arrows indicate valleys.)


Theorem VIII.10 (Double Saddle-point Algorithm). Consider an integral $\int_{\zeta}^{B} F(z)\,dz$,
where the integrand $F = e^{f}$ is an analytic function depending on a large parameter
and $\zeta$ is a double saddle-point, which is a root of the saddle-point equations

\[ f'(\zeta) = 0, \qquad f''(\zeta) = 0 \]

(or, equivalently, $F'(\zeta) = F''(\zeta) = 0$). The point $B$ is supposed to lie inside one of
the three valleys of the double saddle-point.

Assume that the contour $C$ connecting $\zeta$ to $B$ can be split into $C = C^{(0)} \cup C^{(1)}$
in such a way that the following conditions are satisfied: (i) the tail integral $\int_{C^{(1)}}$ is
negligible; (ii) in the central domain $C^{(0)}$, a cubic approximation holds,

\[ f(z) = f(\zeta) + \frac{1}{3!}\,f'''(\zeta)(z - \zeta)^3 + O(\eta_n), \]

with $\eta_n \to 0$ as $n \to \infty$ uniformly; (iii) tails can be completed back. Then one has

(98) \[ \int_{\zeta}^{B} e^{f(z)}\,dz \sim \frac{\omega}{3}\,\Gamma\!\left(\frac{1}{3}\right) \frac{e^{f(\zeta)}}{\sqrt[3]{-f'''(\zeta)/3!}}, \]

where $\omega$ is a cube root of unity ($\omega^3 = 1$), dependent upon the position of the valley
of $B$.
Proof. The proof is a simple adaptation of that of Theorem VIII.3. The heart of the
matter is now the integration of

$$\int_{\mathcal{C}} \exp\!\left(\frac{1}{3!}\, f'''(\zeta)\,(z - \zeta)^3\right) dz,$$

with $\mathcal{C}$ composed of the half-line connecting $\zeta$ to a point at infinity in the valley of
$f'''(\zeta)(z - \zeta)^3$ that contains $B$. A linear change of variable finally reduces the integral
to the canonical form $\int_0^\infty \exp(-w^3)\,dw = \frac{1}{3}\Gamma\!\left(\frac{1}{3}\right)$. □
▷ VIII.47. Higher-order saddle-points. For a saddle-point of order $p + 1$, the saddle-point
formula reads

$$\int_{\zeta}^{B} e^{f(z)}\,dz \sim \frac{\omega}{p}\,\Gamma\!\left(\frac{1}{p}\right) \frac{e^{f(\zeta)}}{\sqrt[p]{-f^{(p)}(\zeta)/p!}},$$

where $\omega^p = 1$. ◁


▷ VIII.48. Vanishing multipliers and multiple saddle-points. This note supplements Note VIII.47.
For a saddle-point of order $p + 1$ and an integrand of the form $(z - \zeta)^b e^{f(z)}$, the saddle-point
formula must be modified according to

$$\int_0^\infty x^{b}\, e^{-a x^{p}/p!}\,dx = \frac{1}{p}\,\Gamma\!\left(\frac{b+1}{p}\right) \left(\frac{p!}{a}\right)^{(b+1)/p}.$$

Thus, the argument of the $\Gamma$ factor is changed from $1/p$ to $(b + 1)/p$, as is the exponent of
$f^{(p)}(\zeta)$ and of $n$ in the case of large power estimates. ◁
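The integral identity of Note VIII.48 (which contains the canonical form used in the proof of Theorem VIII.10 as the case $b = 0$, $p = 3$, $a = 3!$) lends itself to a quick numerical sanity check. The following Python sketch is not part of the original text: the function names `closed_form` and `numeric`, the Simpson rule, and the truncation point of the integral are all illustrative choices of ours.

```python
import math

def closed_form(a, b, p):
    """Right-hand side: (1/p) * Gamma((b+1)/p) * (p!/a)**((b+1)/p)."""
    s = (b + 1) / p
    return math.gamma(s) / p * (math.factorial(p) / a) ** s

def numeric(a, b, p, steps=20_000):
    """Composite Simpson estimate of int_0^inf x**b * exp(-a*x**p/p!) dx.

    The infinite range is truncated (our assumption) at the point where
    the exponential factor drops below e**-60, which is negligible here.
    `steps` must be even; b is assumed to be a non-negative integer.
    """
    c = a / math.factorial(p)
    upper = (60.0 / c) ** (1.0 / p)
    h = upper / steps
    f = lambda x: x ** b * math.exp(-c * x ** p)
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3

if __name__ == "__main__":
    # Canonical cubic case: int_0^inf exp(-w**3) dw versus Gamma(1/3)/3.
    print(numeric(6.0, 0, 3), math.gamma(1 / 3) / 3)
    # Vanishing multiplier (b = 1) at a double saddle-point (p = 3).
    print(numeric(2.0, 1, 3), closed_form(2.0, 1, 3))
```

With $b = 1$ and $p = 3$ the Gamma argument becomes $2/3$, which is the situation met in the forest example below.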


Forests and coalescence of saddle-points. We give below an application to the 
counting of forests of unrooted trees made of a large number of trees. The analysis 
precisely involves a double saddle-point in a certain critical region. The problem is 
in particular relevant to the analysis of random graphs during the phase where a giant 
component has not yet emerged. 


Example VIII.15. Forests of unrooted trees. The problem here consists in determining the
number $F_{m,n}$ of ordered forests, i.e., sequences, made of $m$ (labelled, non-plane) unrooted trees
and comprised of $n$ nodes in total. The number of unrooted trees of size $n$ is, by virtue of
Cayley’s formula, $n^{n-2}$ and its EGF is expressed as $U = T - T^2/2$, where $T$ is the Cayley tree
function satisfying $T = ze^{T}$. Consequently, we have

$$\frac{1}{n!}\,F_{m,n} = [z^n]\left(T - \frac{T^2}{2}\right)^{m} = \frac{1}{2i\pi}\oint \left(T - \frac{T^2}{2}\right)^{m} \frac{dz}{z^{n+1}}.$$


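The case of interest here is when $m$ and $n$ are linearly related. We thus set $m = \alpha n$, where
a priori $\alpha \in (0, 1)$. Then, the integral representation of $F_{m,n}$ becomes

$$\frac{1}{n!}\,F_{m,n} = \frac{1}{2i\pi}\int_{\mathcal{C}} e^{n h_\alpha(t)}\,(1 - t)\,\frac{dt}{t}, \qquad h_\alpha(t) := \alpha \log\left(1 - \frac{t}{2}\right) + t + (\alpha - 1)\log t, \tag{99}$$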


where $\mathcal{C}$ encircles 0. This has the form of a “large power” integral. Saddle-points are found as
usual as zeros of the derivative $h'_\alpha$; there are two of them given by

$$\zeta_0 = 2 - 2\alpha, \qquad \zeta_1 = 1.$$

For $\alpha < 1/2$, one has $\zeta_0 > \zeta_1$, while for $\alpha > 1/2$ the inequality is reversed and $\zeta_0 < \zeta_1$.
In both cases, a simple saddle-point analysis succeeds, based on the saddle-point nearer to the
origin; see Note VIII.49 below. In contrast, when $\alpha = 1/2$, the points $\zeta_0$ and $\zeta_1$ coalesce to
the common value 1. In this last case, we have $h'_{1/2}(1) = h''_{1/2}(1) = 0$ while $h'''_{1/2}(1) = -2$ is
non-zero: there is a double saddle-point at 1.
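The location of the saddle-points and the coalescence at $\alpha = 1/2$ can be confirmed in exact rational arithmetic. The sketch below is ours, not the authors': the helper names `h1`, `h2`, `h3` stand for the first three derivatives of $h_\alpha(t) = \alpha\log(1 - t/2) + t + (\alpha - 1)\log t$, obtained by differentiating by hand.

```python
from fractions import Fraction

def h1(alpha, t):
    """h_alpha'(t) = -alpha/(2 - t) + 1 + (alpha - 1)/t."""
    return -alpha / (2 - t) + 1 + (alpha - 1) / t

def h2(alpha, t):
    """h_alpha''(t) = -alpha/(2 - t)**2 - (alpha - 1)/t**2."""
    return -alpha / (2 - t) ** 2 - (alpha - 1) / t ** 2

def h3(alpha, t):
    """h_alpha'''(t) = -2*alpha/(2 - t)**3 + 2*(alpha - 1)/t**3."""
    return -2 * alpha / (2 - t) ** 3 + 2 * (alpha - 1) / t ** 3

def saddle_points(alpha):
    """The two roots (zeta_0, zeta_1) = (2 - 2*alpha, 1) of h_alpha'."""
    return 2 - 2 * alpha, Fraction(1)

if __name__ == "__main__":
    for alpha in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)):
        z0, z1 = saddle_points(alpha)
        # Both first derivatives vanish; h'' at 1 vanishes only when alpha = 1/2.
        print(alpha, z0, z1, h1(alpha, z0), h1(alpha, z1), h2(alpha, z1), h3(alpha, z1))
```

Using `Fraction` avoids any floating-point doubt about whether a derivative is exactly zero; for $\alpha \neq 1/2$ the second derivative at 1 equals $1 - 2\alpha$, so the saddle-point there is simple.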

The number of forests thus presents two different regimes depending on whether $\alpha < 1/2$
or $\alpha > 1/2$, and there is a discontinuity of the analytic form of the estimates at $\alpha = 1/2$
(see Figure VIII.13). The situation is reminiscent of “critical phenomena” and phase transitions
(e.g., from solid to liquid to gas) in physics, where such discontinuities are encountered. This
provides a good motivation to study what happens right at the “critical” value $\alpha = 1/2$.

As in the analytic proof of the Lagrange Inversion Theorem, it proves convenient to adopt
$t = T$ as an independent variable, so that $z = te^{-t}$ becomes a dependent variable. Since
$dz = (1 - t)e^{-t}\,dt$, this provides the integral representation underlying (99):

$$\frac{1}{n!}\,F_{m,n} = \frac{1}{2i\pi}\int_{\mathcal{C}} \left(t - \frac{t^2}{2}\right)^{m} e^{nt}\,(1 - t)\,\frac{dt}{t^{n+1}}.$$

Figure VIII.13. The function $H(\alpha)$ governing the exponential rate of the number of
forests exhibits a “phase transition” at $\alpha = 1/2$ (left); this is reflected by a plot of the
quantity $\frac{1}{n}\log(F_{m,n}/n!)$, as a function of $\alpha = m/n$ for $n = 200$ (right).


We thus consider the special value $\alpha = 1/2$ and set $h \equiv h_{1/2}$. What is to be determined is
therefore the number of forests of total size $n$ that are made of $n/2$ trees, assuming naturally $n$
even. Bearing in mind that the double saddle-point is at $\zeta = \zeta_0 = \zeta_1 = 1$, one has

$$h(z) = 1 - \frac{1}{2}\log 2 - \frac{1}{3}(z - 1)^3 + O\!\left((z - 1)^4\right) \qquad (z \to 1).$$

Thus, upon neglecting the tails and localizing the integral to a disc centred at 1 with radius
$\delta \equiv \delta(n)$ such that

$$n\delta^3 \to \infty, \qquad n\delta^4 \to 0$$

($\delta = n^{-3/10}$ is suitable), we have the asymptotic equivalence (with $y$ representing $z - 1$)

$$\frac{1}{n!}\,F_{m,n} \sim -\frac{e^{n(1 - \frac{1}{2}\log 2)}}{2i\pi} \int_{\mathcal{D}} e^{-n y^3/3}\, y\, dy + \text{exponentially small}, \tag{100}$$

where $\mathcal{D}$ is a certain (small) contour containing 0 obtained by transformation from $\mathcal{C}$.

The discussion so far has left aside the choice of the contour $\mathcal{C}$ in (99), hence of the
geometric aspect of $\mathcal{D}$ near 0, which is needed in order to fully specify (100). Because of the
minus sign in the third derivative, $h'''(1) = -2$, the three steepest descent half-lines stemming
from 1 have angles $0$, $2\pi/3$, $-2\pi/3$ (that is, directions $1$, $e^{2i\pi/3}$, $e^{-2i\pi/3}$). This suggests
the adoption, as original contour $\mathcal{C}$ in (99), of two symmetric segments stemming from 1
connected by a loop left of 0; see Figure VIII.14. Elementary calculations justify that the
contour can be suitably dimensioned so as to remain always below level $h(1)$. See also the
right-hand drawing of Figure VIII.14, in which the level curves of the valleys below the
saddle-point are drawn, together with a legal contour of integration that winds about 0.

Figure VIII.14. Left: a plot of $|e^{h}|$ with the double saddle-point at 1. Right: The level
curves of $|e^{h}|$ together with a legal integration contour through valleys.

Once the original contour of integration has been fixed, the orientation of $\mathcal{D}$ in (100) is
fully determined. After effecting the further change of variables $y = wn^{-1/3}$ and completing
the tails, we find

$$\frac{1}{n!}\,F_{m,n} \sim \lambda\, n^{-2/3}\, e^{n(1 - \frac{1}{2}\log 2)}, \qquad \lambda := -\frac{1}{2i\pi}\int_{\mathcal{E}} w\, e^{-w^3/3}\,dw, \tag{101}$$

where $\mathcal{E}$ connects $\infty e^{-2i\pi/3}$ to 0 then to $\infty e^{2i\pi/3}$. The evaluation of the integral giving $\lambda$ is
now straightforward (in terms of the Gamma function), which yields the following corollary.
Proposition VIII.11. The number of forests of total size $n$ comprised of $n/2$ unrooted Cayley
trees satisfies

$$\frac{1}{n!}\,F_{n/2,n} \sim \frac{3^{1/6}\,\Gamma(2/3)}{2\pi}\; e^{n}\, 2^{-n/2}\, n^{-2/3}.$$

The number three is characteristically ubiquitous in the formula. (Furthermore, the formula
displays the exponent 2/3 instead of 1/3 in the general case (98) because of the additional factor
$(1 - t)$ present in the integral representation (99), which vanishes at the saddle-point 1; see
Note VIII.48.) □
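The critical estimate can be compared with exact values, since $F_{m,n}/n! = [z^n]\,U(z)^m$ is computable by truncated power-series products. The sketch below is ours and rests on two assumptions that should be kept in mind: the constant $\lambda$ is taken to be $3^{1/6}\,\Gamma(2/3)/(2\pi)$, which is our own evaluation of the integral in (101) along the two rays of angle $\pm 2\pi/3$, and the helper names are hypothetical.

```python
from fractions import Fraction
import math

def unrooted_tree_egf(n):
    """Coefficients u_0..u_n of U(z) = sum_{k>=1} k**(k-2) z**k / k! (Cayley)."""
    coeffs = [Fraction(0)] * (n + 1)
    for k in range(1, n + 1):
        count = 1 if k == 1 else k ** (k - 2)
        coeffs[k] = Fraction(count, math.factorial(k))
    return coeffs

def forest_coefficient(m, n):
    """Exact F_{m,n}/n! = [z^n] U(z)**m, by m truncated series products."""
    u = unrooted_tree_egf(n)
    power = [Fraction(1)] + [Fraction(0)] * n      # the constant series 1
    for _ in range(m):
        nxt = [Fraction(0)] * (n + 1)
        for i, a in enumerate(power):
            if a == 0:
                continue
            for j in range(1, n + 1 - i):
                nxt[i + j] += a * u[j]
        power = nxt
    return power[n]

def critical_estimate(n):
    """lambda * n**(-2/3) * e**n * 2**(-n/2), with lambda as assumed above."""
    lam = 3 ** (1 / 6) * math.gamma(2 / 3) / (2 * math.pi)
    return lam * n ** (-2 / 3) * math.exp(n) * 2 ** (-n / 2)

if __name__ == "__main__":
    for n in (2, 4, 6, 20, 40):
        exact = forest_coefficient(n // 2, n)
        print(n, float(exact) / critical_estimate(n))
```

For instance, $[z^6]\,U^3 = 29/8$, so that $F_{3,6} = 2610$. The printed ratios of exact to estimated values are expected to drift slowly towards 1, the convergence being slow because the error terms only decay like negative powers of $n^{1/3}$.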


The problem of analysing random forests composed of a large number of trees was
first addressed by the Russian School, most notably Kolchin and Britikov. We
refer the reader to Kolchin’s book [387, Ch. I] where nearly thirty pages are devoted 
to a deeper study of the number of forests and of associated parameters. Kolchin’s 
approach is however based on an alternative presentation in terms of sums of indepen- 
dent random variables and stable laws of index 3/2, so that it is limited to first order 
asymptotics. As it turns out there is a striking parallel with the analysis of the growth 
of the random graph in the critical region, when the random graph stops resembling a 
large collection of disconnected tree components. 

An almost sure sign of (hidden or explicit) monkey saddles is the presence of
$\Gamma(1/3)$ factors in the final formulae and cube roots in expressions involving $n$. It is in
fact possible to go much further than we have done here with the analysis of forests
(where we have stayed right at the critical point) and provide asymptotic expressions
that describe the transition between regimes, here from $A^n n^{-3/2}$, to $B^n n^{-2/3}$, then
to $C^n n^{-1/2}$. The analysis then appeals to the theory of coalescent saddle-points well
developed by applied mathematicians (see, e.g., the presentation in [75, 465, 614]) and
the already evoked role of the Airy function. We do not pursue this thread further since
it properly belongs to multivariate asymptotics. It is developed in a detailed manner
in an article of Banderier, Flajolet, Schaeffer, and Soria [28] relative to the size of the
core in a random map, on which our presentation of forests has been modelled (see
also Example IX.42, p. 713).

The results of several studies conducted towards the end of the previous millennium
do suggest that, among threshold phenomena and phase changes, there is a fair
amount of universality in descriptions of combinatorial and probabilistic problems by
means of multiple and coalescing saddle-points. In particular $\Gamma(1/3)$ factors and the
Airy function surface recurrently in the works of Flajolet, Janson, Knuth, Łuczak and
Pittel [241, 354], which are relative to the Erdős–Rényi random graph model in its
critical phase; see also [254] for a partial explanation. The occurrence of the Airy
area distribution (in the context of certain polygon models related to random walks)
can be related to this orbit of techniques, as first shown by Prellberg [496], and strong
numerical evidence evoked in Chapter V (p. 365) suggests that this might extend to
the difficult problem of self-avoiding walks [509]. Airy-related distributions also appear
in problems relative to the satisfiability of random boolean expressions [77], the
path length of trees (Proposition VII.15, p. 534 and [567, 565, 566]), as well as cost
functionals of random allocations (Note VII.54, p. 534 and [249]). The reasons are
sometimes well understood in separate contexts by probabilists, statistical physicists,
combinatorialists, and analysts, but a global framework is still lacking.
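▷ VIII.49. Forests and simple saddle-points. When $0 < \alpha < 1/2$, the number of forests
satisfies, for some computable $C_-(\alpha)$:

$$\frac{1}{n!}\,F_{m,n} \sim C_-(\alpha)\,\frac{e^{nH_-(\alpha)}}{n^{3/2}}, \qquad H_-(\alpha) = 1 - \alpha\log 2.$$

When $1/2 < \alpha < 1$, the number of forests satisfies, for some computable $C_+(\alpha)$:

$$\frac{1}{n!}\,F_{m,n} \sim C_+(\alpha)\,\frac{e^{nH_+(\alpha)}}{n^{1/2}}, \qquad H_+(\alpha) = \alpha\log\alpha + 2 - 2\alpha + (\alpha - 1)\log(2 - 2\alpha).$$

This results from a routine simple saddle-point analysis at $\zeta_1$ and $\zeta_0$, respectively. (The
exponent $3/2$ in the first regime reflects the fact that the factor $(1 - t)$ of (99) vanishes at
$\zeta_1 = 1$.) ◁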
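The two exponential rates are simply the values of $h_\alpha$ at the relevant saddle-point, $H_-(\alpha) = h_\alpha(1)$ and $H_+(\alpha) = h_\alpha(2 - 2\alpha)$, and they agree at the critical value $\alpha = 1/2$. A minimal floating-point sketch of this bookkeeping follows; the function names are ours, and the piecewise function `H` is our reading of the rate function plotted in Figure VIII.13.

```python
import math

def h(alpha, t):
    """h_alpha(t) = alpha*log(1 - t/2) + t + (alpha - 1)*log(t), for 0 < t < 2."""
    return alpha * math.log(1 - t / 2) + t + (alpha - 1) * math.log(t)

def H_minus(alpha):
    """Rate for 0 < alpha < 1/2: the value of h_alpha at zeta_1 = 1."""
    return 1 - alpha * math.log(2)

def H_plus(alpha):
    """Rate for 1/2 < alpha < 1: the value of h_alpha at zeta_0 = 2 - 2*alpha."""
    return (alpha * math.log(alpha) + 2 - 2 * alpha
            + (alpha - 1) * math.log(2 - 2 * alpha))

def H(alpha):
    """Piecewise exponential rate, glued at the critical value alpha = 1/2."""
    return H_minus(alpha) if alpha <= 0.5 else H_plus(alpha)

if __name__ == "__main__":
    for alpha in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(alpha, H(alpha))
```

A short hand computation (ours) shows that $H_-$ and $H_+$ share their first two derivatives at $\alpha = 1/2$ and differ only from the third derivative on, so the transition is not visible as a kink in a plot of $H$ itself.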


VIII. 11. Perspective 


One of the pillars of classical analysis, the saddle-point method plays a major 
role in analytic combinatorics. It provides an approach to coefficient asymptotics and 
can handle combinatorial classes that are not amenable to singularity analysis. The 
simplest case is that of urns, whose generating function $e^{z}$ has no singularities at a
finite distance. Similar functions commonly arise as composed SET constructions. 
Broadly speaking, for the class of generating functions that arise from the combinato- 
rial constructions of Part A of this book, singularity analysis is effective for functions 
that have moderate growth at their singularities; the saddle-point method is effective 
otherwise. 
The essential idea behind the saddle-point method is simple, and it is very easy to
get good bounds on coefficient growth. In effect, for combinatorial generating func- 
tions, the Cauchy coefficient integral defines a surface with a well-defined saddle-point 
somewhere along the positive real axis, and choosing a circle centred at the origin and 
passing through the saddle-point already provides useful bounds by elementary argu- 
ments. The essence of the full saddle-point method is the development of more precise 
bounds, which are obtained by splitting the contour into two parts and balancing the 
associated errors. 
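For the simplest case of urns, the elementary bound mentioned above can be made concrete. For a generating function $f$ with non-negative coefficients, one has $[z^n] f(z) \le f(r)/r^n$ for every $r > 0$; with $f(z) = e^z$, the best choice is the saddle-point $r = n$. The Python sketch below (function names ours) compares this bound with the exact coefficient $1/n!$.

```python
import math

def saddle_point_bound(n):
    """min over r > 0 of e**r / r**n, attained at r = n (root of 1 - n/r = 0)."""
    return math.exp(n - n * math.log(n))

def exact_coefficient(n):
    """[z^n] e**z = 1/n!."""
    return 1 / math.factorial(n)

if __name__ == "__main__":
    for n in (5, 20, 100):
        ratio = saddle_point_bound(n) / exact_coefficient(n)
        # By Stirling's formula the bound overshoots by about sqrt(2*pi*n).
        print(n, ratio, math.sqrt(2 * math.pi * n))
```

The bound is thus off only by a factor of order $\sqrt{n}$; recovering that factor is precisely what the full method, with its splitting of the contour, achieves.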


Combinatorial classes that are amenable to saddle-point analysis have so far only 
been incorporated into relatively few schemas, compared to what we saw for singu- 
larity analysis. The consistency of the approach certainly argues for the existence of 
many more such schemas. A positive signal in that direction is the fact that several 
researchers have developed concepts of admissibility that serve to delineate classes of 
functions for which the saddle-point method boils down to verifying simple conditions.

The saddle-point method also provides insights in more general contexts. Most 
notably, the general results on analysis of large powers lay the groundwork for distri- 
butional analyses and limit laws, which are the subject of the next chapter. 


Bibliographic notes. Saddle-point methods take their sources in applied mathematics, one of 
them being the asymptotic analysis by Debye (1909) of Bessel functions of large order. (In fact, 
there are early signals of its use by Riemann in relation to hypergeometric functions [511] and to 
the zeta function, as noted by Edwards [186, p. 139], as well as traces of it in works of Cauchy 
published in 1827: see the scholarly study by Petrova and Solov’ev [483].) Saddle-point ana- 
lysis is sometimes called steepest descent analysis, especially when integration contours strictly 
coincide with steepest descent paths. Saddle-points themselves are also called critical points 
(i.e., points where a first derivative vanishes). Because of its roots in applied mathematics, the 
method is well covered by the literature in this area, and we refer to the books by Olver [465], 
Henrici [329], or Wong [614] for extensive discussions. A vivid introduction to the subject is to 
be found in De Bruijn’s book [143]. We also recommend Odlyzko’s impressive survey [460]. 

To a large extent, saddle-point methods were introduced into the world of combinatorial 
enumerations in the 1950s. Early combinatorial papers were concerned with permutations (in- 
volutions) or set partitions: this includes works by Moser and Wyman [448, 449, 450] that are 
mostly directed towards entire functions. 

Hayman’s approach [325] which we have expounded here (see also [614]) is notable in its 
generality as it envisions saddle-point analysis in an abstract perspective, which makes it possi- 
ble to develop general closure theorems. A similar thread was followed by Harris and Schoen- 
feld who gave stronger conditions allowing for full asymptotic expansions [323]; Odlyzko and 
Richmond [462] were successful in connecting these conditions with Hayman admissibility. 
Another valuable work is Wyman’s extension to non-positive functions [624]. 

Interestingly enough, developments that parallel the ones in analytic combinatorics have 
taken place in other regions of mathematics. Erwin Schrödinger introduced saddle-point methods
in his lectures [535] at Dublin in 1944 in order to provide a rigorous foundation to some
models of statistical physics that closely resemble balls-in-bins models. Daniels’ publica- 
tion [136] of 1954 is a historical source for saddle-point techniques in probability and statistics, 
in which refined versions of the central limit theorem can be obtained. (See for instance the 
description in Greene and Knuth’s book [310].) Since then, the saddle-point method has proved 
a useful tool for deriving Gaussian limiting distributions. We have given here some idea of this 
approach which is to be developed further in Chapter IX, where we shall discuss some of Canfield’s
results [101]. Analytic number theory also makes a heavy use of saddle-point analysis.
In additive number theory, the works by Hardy, Littlewood, and Ramanujan relative to integer 
partitions have been especially influential, see for instance Andrews’ book [14] and Hardy’s 
Lectures on Ramanujan [321] for a fascinating perspective. (In multiplicative number theory, 
generating functions take the form of Dirichlet series while Perron’s formula replaces Cauchy’s 
formula. For saddle-point methods in this context, we refer to Tenenbaum’s book [576] and his 
seminar survey [575].) 

A more global perspective on limit probability distributions and saddle-point techniques 
will be given in the next chapter, since there are strong relations to the quasi-powers framework 
developed there, to local limit laws, and to large deviation estimates. General references for 
some of these aspects of the saddle-point method are the articles of Bender-Richmond [45], 
Canfield [101], Gardy [280, 281, 282], and Gittenberger-Mandlburger [292]. With regard to 
multiple saddle-points and phase transitions, we refer the reader to references provided at the 
end of Section VIII. 10, on p. 605. 
proposé à un austère janséniste par un homme du monde,
Analytic combinatorics concerns itself with the elucidation of properties of combinatorial
structures in relation to algebraic and analytic properties of generating functions.
The most basic cases are the enumeration of combinatorial classes and the analysis of 
moments of combinatorial parameters. These involve generating functions in one (for- 
mal or complex) variable as discussed extensively in previous chapters and represent 
essentially univariate problems. 

Many applications, in various sciences as well as in combinatorics itself, require 
quantifying the behaviour of parameters of combinatorial structures. The correspond- 
ing problems are now of a multivariate nature, as one typically wants a way to estimate 
the number of objects in a combinatorial class having a fixed size and a given param- 
eter value. Average-case analyses usually do not suffice, since it is often important to 
predict what is likely to be observed in simulations or on actual data that obey a given
randomness model, in terms of possible deviations from the mean; this signifies that
information on probability distributions is wanted. Useful but crude estimates are derived
from the moment inequalities developed in Section III. 2.2, p. 161. However,
much more is usually true. Indeed, it is frequently observed that the histograms of
the distribution of a combinatorial parameter (for varying size values) exhibit a common
characteristic “shape”, as the size of the random combinatorial structure tends
to infinity. In this case, we say that there exists a limit law. Our goal in this chapter
is precisely to introduce a methodology for distilling limit laws from combinatorial
specifications.

¹ “A problem relative to games of chance proposed to an austere Jansenist by a man of the world has
been at the origin of the calculus of probabilities.” Poisson refers here to the fact that questions of betting
and gambling posed by the Chevalier de Méré (who was both a gambler and a philosopher) led Pascal (an
austere religious man) to develop some of the first foundations of probability theory.


In simpler cases, limit laws are discrete and, when this happens, they often turn 
out to be of the geometric or Poisson type. In many other situations, limit laws are 
continuous, a case of prime importance being the Gaussian law associated with the 
famous bell-shaped curve, which is found so often to occur in elementary combinato- 
rial structures. This chapter develops a coherent set of analytic techniques dedicated 
to extracting such discrete and continuous laws by exploiting properties of bivariate 
generating functions. The starting point is provided by symbolic methods of Part A 
(especially Chapter III), which enable us to derive systematically bivariate generat- 
ing functions for many natural parameters of combinatorial structures. The methods 
presented here then combine complex asymptotic techniques of Part B with a small se- 
lection of fundamental theorems from the analytic side of classical probability theory 
recalled in Appendix C (Complements of Probability Theory). 


Under the theory to be expounded, bivariate generating functions are processed 
analytically as follows. The auxiliary variable marking the combinatorial parameter 
of interest is regarded as inducing a deformation of the (univariate) counting gener- 
ating function. The way in which such deformations affect the type of singularity of 
the counting generating functions can then be studied: a perturbation of univariate 
singularity analysis is often sufficient to derive an asymptotic estimate of the proba- 
bility generating function of a given parameter, when taken over objects of some large 
size. Continuity theorems from probability theory finally allow us to conclude on the 
existence of a limit law and characterize it. 


An especially important component of this paradigm is the framework of “quasi-powers”.
Large powers tend to occur in the asymptotic form of coefficients of counting
generating functions (think of radius of convergence bounds and $\rho^{-n}$ factors). The
collection of deformations of a counting generating function is then likely to induce 
for the corresponding coefficients a collection of approximations that also asymptoti- 
cally involve large powers—technically, these are referred to as quasi-powers. From 
this, a Gaussian law is derived along lines that are somewhat reminiscent of the classi- 
cal Central Limit Theorem of probability theory, which expresses the asymptotically 
Gaussian character of sums of independent random variables. 


This chapter starts with an informal introduction to limit laws, either discrete or
continuous (Section IX. 1). Sections IX. 2 and IX. 3 then present methods and examples
relative to discrete laws in combinatorics. Continuous limit laws form the
subject of Section IX. 4, dedicated to general methodology, and Section IX. 5 where
the quasi-powers framework is introduced. Three sections, IX. 6, IX. 7, and IX. 8, then
develop the extension of meromorphic asymptotics, singularity analysis, and saddle-point
methods to the characterization of Gaussian limit laws in combinatorics. Additional
properties, such as local limits and large deviations, form the subject of Sections
IX. 9 and IX. 10, respectively. The chapter concludes with a discussion of non-Gaussian
limits (in particular stable laws, Section IX. 11) and multivariate problems
(Section IX. 12).

In the business of limit laws in combinatorics, as true elsewhere, the spirit is more 
important than the letter. That is, methods are often more important than theorems, 
whose statements may involve somewhat intricate technical conditions. We have made 
every effort to expound the former in a “conceptual” manner, but shall try our best to 
avoid the latter. 

Within the perspective of analytic combinatorics, the direct relation that can be es- 
tablished between combinatorial specifications and asymptotic properties, in the form 
of limit laws, is striking and is a characteristic feature of the theory. In particular, all 
the schemas previously introduced in this book lead to well-characterized limit laws. 
As we shall see throughout this chapter, almost any basic law of probability theory 
and statistics is likely to occur somewhere in combinatorics; conversely, almost any 
simple combinatorial parameter is likely to be governed by a limit law. 


IX. 1. Limit laws and combinatorial structures 


What is given is a combinatorial class $\mathcal{F}$, labelled or unlabelled, and an integer-valued
combinatorial parameter $\chi$. There results both a family of probabilistic models,
namely for each $n$ the uniform distribution over $\mathcal{F}_n$ that assigns to any $\gamma \in \mathcal{F}_n$ the
probability

$$\mathbb{P}(\gamma) = \frac{1}{F_n}, \qquad \text{with } F_n = \operatorname{card}(\mathcal{F}_n),$$

and a corresponding family of random variables obtained by restricting $\chi$ to $\mathcal{F}_n$. Under
the uniform distribution over $\mathcal{F}_n$, we then have

$$\mathbb{P}_{\mathcal{F}_n}(\chi = k) = \frac{1}{F_n}\,\operatorname{card}\{\gamma \in \mathcal{F}_n \mid \chi(\gamma) = k\}.$$

We write $\mathbb{P}_{\mathcal{F}_n}$ to indicate the probabilistic model relative to $\mathcal{F}_n$, but also freely abbreviate
it to $\mathbb{P}_n$ or write the probability distribution as $\mathbb{P}(\chi_n = k)$, whenever $\mathcal{F}$ is clear
from context.

As $n$ increases, the histograms of the distribution of $\chi_n$ often share a common
profile; see Example IX.1 and Figure IX.1 for two elementary parameters, one leading
to a discrete law, the other to a continuous limit. It is from such observations that the
notion of a limit law is abstracted.


Example IX.1. Binary words: elementary approach. Consider the class $\mathcal{W}$ of binary words
over $\{a, b\}$. We examine two parameters purposely chosen simple enough, so that explicit
expressions are available for the probability distributions at stake. Define the parameters

$$\chi(w) := \text{number of initial } a \text{ in } w, \qquad \xi(w) := \text{total number of } a \text{ in } w,$$

and the corresponding counts,

$$W^{\chi}_{n,k} := \operatorname{card}\{w \in \mathcal{W}_n \mid \chi(w) = k\}, \qquad W^{\xi}_{n,k} := \operatorname{card}\{w \in \mathcal{W}_n \mid \xi(w) = k\}.$$
Figure IX.1. Histograms of probability distributions for the number of initial $a$ in a
random binary string for $n = 10$ ($\chi$: left) and the total number of $a$ for $n = 20$ ($\xi$:
right). The histogram corresponding to $\chi$ is not normalized and direct convergence to
a discrete geometric law is apparent; for $\xi$, the horizontal axis is scaled to $n$, and the
histogram closely matches the bell-shaped curve that is characteristic of a continuous
Gaussian limit.


Explicit expressions result from elementary combinatorics: for $0 \le k \le n$, we have

$$W^{\chi}_{n,k} = 2^{n-k-1}\,[\![0 \le k < n]\!] + [\![k = n]\!], \qquad W^{\xi}_{n,k} = \binom{n}{k}.$$


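The probability distributions are accordingly ($[\![\,\cdot\,]\!]$ is Iverson’s notation for the indicator
function):

$$\mathbb{P}_{\mathcal{W}_n}(\chi = k) = \frac{1}{2^{k+1}}\,[\![0 \le k < n]\!] + \frac{1}{2^{n}}\,[\![k = n]\!], \qquad \mathbb{P}_{\mathcal{W}_n}(\xi = k) = \frac{1}{2^{n}}\binom{n}{k}.$$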


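The probabilities relative to $\chi$ then resemble, in the asymptotic limit of large $n$, the
geometric distribution. Indeed, one has, for each $k$,

$$\lim_{n\to\infty} \mathbb{P}_{\mathcal{W}_n}(\chi = k) = \frac{1}{2^{k+1}} \qquad\text{and}\qquad \lim_{n\to\infty} \mathbb{P}_{\mathcal{W}_n}(\chi \le k) = 1 - \frac{1}{2^{k+1}}.$$

We say that there is a discrete limit law of the geometric type for $\chi$.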


In contrast, the parameter $\xi$ taken over $\mathcal{W}_n$ has mean $\mu_n := n/2$ and standard deviation
$\sigma_n := \frac{1}{2}\sqrt{n}$. One should then centre and scale the parameter $\xi$, introducing the
“standardized” (or “normalized”) random variable

$$X_n^{\star} := \frac{\xi_n - \mathbb{E}(\xi_n)}{\sqrt{\mathbb{V}(\xi_n)}} = \frac{\xi_n - n/2}{\frac{1}{2}\sqrt{n}}. \tag{1}$$

It then becomes possible to examine the (cumulative) distribution function $\mathbb{P}(X_n^{\star} \le y)$ for fixed
values of $y$. In terms of $\xi$ itself, we are considering $\mathbb{P}(\xi_n \le \mu_n + y\sigma_n)$ for real values of $y$. Then,
the classical approximation of the binomial coefficients yields the approximation (Note IX.1):

$$\lim_{n\to\infty} \mathbb{P}(\xi_n \le \mu_n + y\sigma_n) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{y} e^{-t^2/2}\,dt. \tag{2}$$

We now say that there is a continuous limit law of the Gaussian type for $\xi$. □
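Both laws of Example IX.1 are easy to observe experimentally. The following Python sketch is an illustration of ours (all function names are hypothetical): it enumerates the $2^n$ binary words for a small $n$ to recover the exact distributions of $\chi$ and $\xi$, and it compares the cumulative distribution of $\xi$, summed exactly from binomial coefficients, with the Gaussian distribution function of (2).

```python
import math
from itertools import product

def initial_a(word):
    """chi: length of the maximal run of 'a' at the start of the word."""
    k = 0
    for letter in word:
        if letter != "a":
            break
        k += 1
    return k

def distributions(n):
    """Exact laws of chi and xi over the 2**n binary words of length n."""
    chi = [0] * (n + 1)
    xi = [0] * (n + 1)
    for word in product("ab", repeat=n):
        chi[initial_a(word)] += 1
        xi[word.count("a")] += 1
    total = 2 ** n
    return [c / total for c in chi], [c / total for c in xi]

def xi_cdf(n, y):
    """P(xi_n <= n/2 + y*sqrt(n)/2), summed exactly from binomial coefficients."""
    bound = math.floor(n / 2 + y * math.sqrt(n) / 2)
    bound = max(-1, min(n, bound))
    return sum(math.comb(n, k) for k in range(bound + 1)) / 2 ** n

def normal_cdf(y):
    """Distribution function of the standard Gaussian law."""
    return 0.5 * (1 + math.erf(y / math.sqrt(2)))

if __name__ == "__main__":
    chi_law, xi_law = distributions(12)
    print([round(p, 5) for p in chi_law[:5]])      # close to 1/2, 1/4, 1/8, ...
    for y in (-1.0, 0.0, 1.0, 2.0):
        print(y, xi_cdf(400, y), normal_cdf(y))
```

The discrepancy between `xi_cdf` and `normal_cdf` at moderate $n$ is of the order of a single point probability, that is, of order $n^{-1/2}$; this is a first glimpse of the question of speed of convergence.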
▷ IX.1. Local and central approximations of the binomial law. Equation (2) is classically
derived by summation from the “local” approximation,

$$\frac{1}{2^{n}}\binom{n}{\frac{1}{2}n + \frac{1}{2}y\sqrt{n}} \sim \frac{e^{-y^2/2}}{\sqrt{\pi n/2}}, \tag{3}$$

valid for $y = o(n^{1/6})$. A proof of (3) can be obtained by the method of De Moivre (1721), see
Note III.3, p. 160, or by Stirling’s formula. ◁
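The local approximation (3) can be checked directly against exact binomial probabilities. In the sketch below (ours; the function names are hypothetical), the offset from the centre is $k - n/2 = \frac{1}{2}y\sqrt{n}$, so that for $n = 400$ the value $y = 1$ corresponds to $k = 210$.

```python
import math

def exact_probability(n, k):
    """P(xi_n = k) = binomial(n, k) / 2**n, computed exactly then rounded once."""
    return math.comb(n, k) / 2 ** n

def local_gaussian(n, y):
    """Right-hand side of the local approximation: exp(-y**2/2) / sqrt(pi*n/2)."""
    return math.exp(-y * y / 2) / math.sqrt(math.pi * n / 2)

if __name__ == "__main__":
    n = 400
    for y in (0.0, 0.5, 1.0, 2.0):
        k = n // 2 + round(y * math.sqrt(n) / 2)
        print(y, k, exact_probability(n, k) / local_gaussian(n, y))
```

The printed ratios stay close to 1 for moderate $y$ and deteriorate as $y$ grows with $n$ fixed, in line with the restriction on the range of $y$.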

Combinatorial distributions and limit laws. In accordance with the general notion
of convergence in distribution (or weak convergence, see Appendix C.5: Convergence
in law, p. 776), we shall say that a limit law exists for a parameter if there
is convergence of the corresponding family of cumulative distribution functions. In
virtually all cases² encountered in this book, there are, as in Example IX.1, two
major types of convergence that the a priori discrete distribution of a combinatorial
parameter may satisfy:

Discrete → Discrete and Discrete → Continuous.

Regarding the discrete-to-discrete case, convergence is established without standardizing
the random variables involved. In the discrete-to-continuous case, the parameter
is to be centred at its mean and scaled by its standard deviation, as in (1).

There is also interest in obtaining a local limit law, which, when available, quan- 
tifies individual probabilities (rather than the cumulative distribution functions). In 
the discrete-to-discrete case, the distinction between local and “global” limits is im- 
material, since the existence of one type of law implies the other. In the discrete-to- 
continuous case, the local limit is expressed in terms of a fixed probability density, as 
in (3), and is technically more demanding to derive, since stronger analytic properties 
are required. 

The speed of convergence in a limit law describes the way the finite combinatorial 
distributions approach their asymptotic limit. It provides useful information on the 
quality of asymptotic approximations for finite n models. 

Finally, quantifying the “risk” of extreme configurations, far away from the mean,
necessitates estimates on the tails of the distributions. Such estimates belong to the
theory of large deviations and they constitute a useful complement to the study of
central and local limits. These various notions are summarized in Figure IX.2.


Classical probability theory has elaborated highly useful tools for analysing limit 
distributions. For each of the major two types, a continuity theorem provides condi- 
tions under which convergence in law can be established from convergence of trans- 
forms. The transforms in question are probability generating functions (PGFs) for the 
discrete case, characteristic functions or Laplace transforms otherwise. Refinements,
known as the Berry–Esseen inequalities, relate speed of convergence of the combinatorial
distributions to their limit on the one hand, and a distance between transforms
on the other. Put otherwise, distributions are close if their transforms are close. Large
deviation estimates are finally obtained by a technique of “shifting the mean”, which 
is otherwise familiar in probability and statistics. 


² See, however, the case of longest runs in words in Example V.4, p. 308, for a family of discrete
distributions that need centring.
Limit law: An asymptotic approximation of the cumulative distribution function of a combinatorial
parameter in terms of the cumulative distribution function of a fixed random variable,
called the “limit”. Thus one estimates $\mathbb{P}_n(\chi \le k)$. Centring and scaling, a process called
standardization, is needed in the case of a continuous limit.

Local limit law: A direct asymptotic estimate of “local values” of the combinatorial probabilities,
$\mathbb{P}_n(\chi = k)$. In the discrete case, existence of basic and local limits are logically equivalent
properties. In the continuous case, standardization is needed and the resulting estimate is expressed
in terms of the density of a fixed continuous random variable.

Tail estimates and large deviations: For a given distribution, tail estimates are asymptotic 
estimates of the probability of deviating from the mean by a large quantity. Large deviation 
estimates quantify the tail probabilities of a family of distributions, when these decay at an 
exponential rate (in a suitable scale). 

Speed of convergence: An upper bound on the error in asymptotic estimates. 


Figure IX.2. An informal summary of the main notions of relevance to the analysis 
of combinatorial distributions. 


Limit laws and bivariate generating functions. In this chapter, the starting point
of a distributional analysis is invariably a bivariate generating function

$$F(z, u) = \sum_{n,k} f_{n,k}\, u^{k} z^{n},$$

where $f_{n,k}$ represents (up to a possible normalization factor) the number of structures
of size $n$ in some class $\mathcal{F}$ on which the parameter under study takes the value $k$. What is
sought is asymptotic information relative to the array of coefficients

$$f_{n,k} = [z^{n} u^{k}]\, F(z, u).$$

Thus, a double coefficient extraction is to be effected. This task could in principle be
approached by an iterated use of Cauchy’s coefficient formula,

$$[z^{n} u^{k}]\, F(z, u) = \left(\frac{1}{2i\pi}\right)^{2} \oint\!\!\oint F(z, u)\,\frac{dz}{z^{n+1}}\,\frac{du}{u^{k+1}},$$

but this approach is hard to carry out³ and, at our current stage of knowledge, it
appears to be less general than the path taken in this chapter.

Here is a broad outline of the principles behind the theory to be developed in the
next few sections of this chapter. First, as we know all too well, the specialization at
$u = 1$ of $F(z,u)$ gives the counting generating function of $\mathcal{F}$, that is, $F(z) = F(z,1)$.
Next, as seen repeatedly starting from Chapter III, the moments of the combinatorial
distribution $\{f_{n,k}\}$ for fixed $n$ and varying $k$ are attainable through the partial
derivatives at $u = 1$, namely

\[ \text{first moment} \;\leftrightarrow\; \left.\frac{\partial}{\partial u} F(z,u)\right|_{u=1}, \qquad
   \text{second moment} \;\leftrightarrow\; \left.\frac{\partial^{2}}{\partial u^{2}} F(z,u)\right|_{u=1}, \]

³ A collection of recent works by Pemantle and coauthors [474, 475, 476] shows, however, that a
well-defined class of bivariate asymptotic problems can be attacked by the theory of functions of several
complex variables and a detailed study of the geometry of a singular variety.
Problem                GF                                             u-region                                      Reference
counting               $F(z,1)$                                       $u = 1$                                       Ch. I and II
moments                $\left.\partial_u^{r} F(z,u)\right|_{u=1}$     $u = 1 \pm o(1)$                              Ch. III
Discrete laws
  limit law            $F(z,u)$                                       $u \in \Omega \subset \{|u| \le 1\}$          Th. IX.1, p. 624
  tails                $F(z,u)$                                       $|u| = r$, $r > 1$                            Th. IX.3, p. 627
Continuous laws
  limit law, Gaussian  $F(z,u)$                                       $u \in \Omega \subset \mathbb{C}$, $1 \in \Omega$   Th. IX.8, p. 645
  local limit law      $F(z,u)$                                       $u \in \Omega \cup \{|u| = 1\}$               Th. IX.14, p. 696
  large deviations     $F(z,u)$                                       $u \in [1-\delta, 1+\delta]$                  Th. IX.15, p. 700


Figure IX.3. A summary of the correspondence between analytic properties of bi- 
variate generating functions (BGFs) and probabilistic properties of combinatorial dis- 
tributions. 


and so on. In summary: Counting is provided by the bivariate generating function 
F(z, u) taken at u = 1; moments result from the bivariate generating function taken 
in an infinitesimal neighbourhood of u = 1. 

Our approach to limit laws will then be as follows. The goal is to estimate the
“horizontal” generating function

\[ f_n(u) := \sum_{k} f_{n,k}\, u^{k} \equiv [z^{n}]\, F(z,u), \]

which is proportional to the probability generating function of $\chi$ taken over $\mathcal{F}_n$,
since $\mathbb{E}_{\mathcal{F}_n}(u^{\chi}) = f_n(u)/f_n(1)$. The problem is viewed as a single coefficient
extraction (extracting the coefficient of $z^{n}$) but parameterized by $u$; see our paragraph
on “singularity perturbation” below for a brief discussion. Thanks to the availability
of continuity theorems, the following can then be proved for a great many cases of
combinatorial interest: The existence and the shape of the limit law are derived from
an asymptotic estimate of $f_n(u)$, when $u$ is taken in a fixed neighbourhood of 1, which
estimate depends on the behaviour of the generating function $z \mapsto F(z,u)$, for $u \approx 1$.
This is the basic paradigm of analysis explored throughout most of the chapter.

In addition, thanks to Berry–Esseen inequalities, the quality of a uniform asymptotic
estimate for $f_n(u)$ translates into a speed of convergence estimate for the
corresponding limit law. Also, for the discrete-to-continuous case, as we shall see
in Section IX. 9 based on the saddle-point method, local limit laws are derived from
consideration of the generating function $z \mapsto F(z,u)$, when $u$ is assigned values on
the unit circle, $|u| = 1$. In that case, the secondary inversion, with respect to $u$, is
effected by the saddle-point method, rather than by continuity theorems; the principles
extend the analysis of large powers presented in Section VIII. 8, p. 585. Finally,
large deviation estimates are found to arise from estimates of $f_n(u)$ when $u$ is real and
either $u < 1$ (left tail) or $u > 1$ (right tail), this property being simply a reflection of
saddle-point bounds; see Section IX. 10.
The correspondence between analytic properties of bivariate generating functions
and probabilistic properties of distributions is summarized in Figure IX.3; see also the
diagram of Figure IX.9 (p. 649) specialized to continuous limit laws.


Singularity perturbation. As seen throughout Chapters IV–VIII, analytic combinatorics
approaches the univariate problem of counting objects of size $n$ starting from
the Cauchy coefficient integral,

\[ [z^{n}]\, F(z) = \frac{1}{2i\pi} \int_{\gamma} F(z)\, \frac{dz}{z^{n+1}}. \]

The singularities of $F(z)$ can be exploited, whether they are of a polar type (Chapters
IV and V), algebraic–logarithmic of singularity analysis class (Chapters VI
and VII) or essential and amenable to the saddle-point method (Chapter VIII).

From the discussion above, crucial information on combinatorial distributions
is accessible from the bivariate generating function $F(z,u)$ when $u$ varies in some
domain containing 1. This suggests considering $F(z,u)$ not so much as an analytic
function of two complex variables, where $z$ and $u$ would play a symmetric rôle, but
rather as a collection of functions of $z$ indexed by a secondary parameter $u$. In other
words, $F(z,u)$ is considered as a deformation of $F(z) \equiv F(z,1)$ when $u$ varies in a
domain containing $u = 1$. Cauchy’s coefficient integral gives

\[ f_n(u) \equiv [z^{n}]\, F(z,u) = \frac{1}{2i\pi} \int_{\gamma} F(z,u)\, \frac{dz}{z^{n+1}}. \]

For $u = 1$, an asymptotic form of $f_n(1) = [z^{n}] F(z,1)$ is obtained by suitable
contour integration techniques of Part B. We can then examine the way the parameter
$u$ affects the asymptotic coefficient extraction process⁴, with the goal of deriving an
asymptotic estimate of $f_n(u)$, when $u$ is close to 1. Such an approach is called a
singularity perturbation analysis. For instance, a singularity of $F(z,1)$ at $z = \rho$ typically
implies for the coefficients of $F(z,1)$ an estimate of the form $f_n(1) \approx \rho^{-n} n^{\alpha}$, and,
in lucky cases (of which there are many, see Sections IX. 6 and IX. 7), this univariate
analysis can be extended, resulting in an estimate of the form $f_n(u) \approx \rho(u)^{-n} n^{\alpha}$.
Under such circumstances, the probability generating function of the parameter $\chi$
associated to $F(z,u)$ satisfies the estimate

\[ \mathbb{E}_{\mathcal{F}_n}(u^{\chi}) = \frac{f_n(u)}{f_n(1)} \approx \left(\frac{\rho(1)}{\rho(u)}\right)^{n}. \tag{4} \]

This analytical form is reminiscent of the central limit theorem of probability theory,
according to which large powers of a fixed PGF (corresponding to sums of a large
number of independent random variables) entail convergence to a Gaussian law⁵;
such a law is indeed obtained here. In this chapter, we are going to see numerous
applications of this strategy, which we now briefly illustrate by revisiting the case of
binary words from Example IX.1.


⁴ The essential feature of the analysis of coefficients of GFs by means of complex analytic techniques,
as developed in Chapters IV–VIII, is to be robust: being based on contour integrals, it is usually amenable
to smooth perturbations and provides uniform error terms.

⁵ See also Section VIII. 8, p. 585.
Example IX.2. Binary words: the BGF approach. Regarding binary words and the two
parameters $\chi$ (initial run of a’s) and $\xi$ (total number of a’s), the general strategy of singularity
perturbation starts from the BGFs,

\[ \mathcal{W}^{\chi} = \operatorname{SEQ}(ua)\,\operatorname{SEQ}(b\,\operatorname{SEQ}(a)) \quad\Longrightarrow\quad W^{\chi}(z,u) = \frac{1}{1-zu}\cdot\frac{1}{1-\frac{z}{1-z}}, \]
\[ \mathcal{W}^{\xi} = \operatorname{SEQ}(ua+b) \quad\Longrightarrow\quad W^{\xi}(z,u) = \frac{1}{1-z(1+u)}, \]

and it instantiates as follows.
Consider the secondary variable $u$ fixed at some value $u_0$. In the case of $W^{\chi}$, there are two
components in the BGF

\[ W^{\chi}(z,u_0) = \frac{1}{1-u_0 z}\cdot\frac{1-z}{1-2z}, \]

and the dominant singular part, with a simple pole at $z = 1/2$, arises from the second factor as
long as $|u_0| < 2$. Accordingly, one has

\[ W^{\chi}(z,u_0) \underset{z\to 1/2}{\sim} \frac{1/2}{(1-u_0/2)(1-2z)}, \qquad\text{implying}\qquad [z^{n}]\, W^{\chi}(z,u_0) \sim \frac{1/2}{1-u_0/2}\, 2^{n}. \]

The probability generating function of $\chi$ over $\mathcal{W}_n$ is then obtained upon dividing by $2^{n}$,

\[ \mathbb{E}_{\mathcal{W}_n}(u_0^{\chi}) = \frac{1}{2^{n}}\,[z^{n}]\, W^{\chi}(z,u_0) \sim \frac{1/2}{1-u_0/2} = \sum_{k\ge 0} \frac{1}{2^{k+1}}\, u_0^{k}, \]

where the last expression is none other than the probability generating function of a discrete law,
namely, the geometric distribution of parameter 1/2. As we shall see in Section IX. 2 where we
enunciate a continuity theorem for probability generating functions, this is enough to conclude
that the distribution of $\chi$ converges to a geometric law.
In the second case, that of $W^{\xi}$, the auxiliary parameter modifies the location of the
singularity,

\[ W^{\xi}(z,u_0) = \frac{1}{1-z(1+u_0)}. \]

Then, the (unique) singularity smoothly moves,

\[ \rho(u_0) = \frac{1}{1+u_0}, \]

as $u_0$ varies, while the type of singularity (here a simple pole) remains the same; we thus
encounter an extremely simplified form of (4). Accordingly, the coefficients $[z^{n}] W^{\xi}(z,u_0)$ are
described by a “large power” formula (here of an exact type, as in Section VIII. 8, p. 585). As
regards the probability generating function of $\xi$ over $\mathcal{W}_n$, one has

\[ \mathbb{E}_{\mathcal{W}_n}(u_0^{\xi}) = \frac{1}{2^{n}}\,[z^{n}]\, W^{\xi}(z,u_0) = \left(\frac{1+u_0}{2}\right)^{n}. \]

In the perspective of the present chapter, this last form (here especially simple) is amenable to
continuity theorems for integral transforms (Section IX. 4). There results a continuous limit law
of the Gaussian type in this case. □
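
Both behaviours of Example IX.2 can be checked by brute force for small $n$. The following is a minimal sketch and not part of the original text; the function names are illustrative assumptions. It enumerates all binary words of length $n$, tabulates the exact law of the initial run of a’s, and evaluates the PGF of the number of a’s at a chosen $u_0$ in exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def initial_run_distribution(n):
    """Exact law of the length of the initial run of a's over the 2^n binary words."""
    counts = [0] * (n + 1)
    for word in product("ab", repeat=n):
        run = 0
        for letter in word:
            if letter != "a":
                break
            run += 1
        counts[run] += 1
    return [Fraction(c, 2 ** n) for c in counts]


def pgf_number_of_a(n, u0):
    """E(u0^xi) over words of length n, where xi counts the letters a."""
    total = sum(Fraction(u0) ** word.count("a") for word in product("ab", repeat=n))
    return total / 2 ** n


if __name__ == "__main__":
    n = 10
    dist = initial_run_distribution(n)
    # Exact probabilities next to the geometric(1/2) values 1 / 2^(k+1).
    for k in range(5):
        print(k, dist[k], Fraction(1, 2 ** (k + 1)))
    # The PGF of the number of a's is exactly the n-th power ((1 + u0) / 2)^n.
    u0 = Fraction(1, 3)
    print(pgf_number_of_a(n, u0) == ((1 + u0) / 2) ** n)
```

For the initial run, the probabilities already coincide with the geometric values for every $k < n$ (only the boundary value $k = n$ differs), whereas for the number of a’s the law is an exact $n$th power and no discrete limit exists without normalization.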


It is typical of the approach taken in this chapter that, once equipped with suitably 
general theorems, it is hardly more difficult to discuss the number of leaves in a non- 
plane unlabelled tree or the number of summands in a composition into primes. 
$F(z,u)$ for $u \approx 1$      type of law                  method and schemas

Sing. + exp. fixed         Discrete limit               Subcritical composition §IX.3
                           (Neg. bin., Poisson, ...)    Subcritical Seq., Set, ... §IX.3

Sing. moves, exp. fixed    Gaussian($n$, $n$)           Supercritical composition
                                                        Meromorphic perturb. §IX.6
                                                        (Rational functions) §IX.6
                                                        Sing. analysis perturb. §IX.7
                                                        (Alg., implicit functions) §IX.7.3

Sing. fixed, exp. moves    Gaussian($\log n$, $\log n$) (Exp-log structures) §IX.7.1
                                                        (Differential eq.) §IX.7.4

Sing. + exp. move          Gaussian                     [Gao–Richmond [277]]

Essential singularity      often Gaussian               Saddle-point perturb. §IX.8

Discontinuous type         non-Gaussian                 Various cases §IX.11
                           stable                       Critical composition §IX.11.2


Figure IX.4. A rough typology of bivariate generating functions $F(z,u)$ and limit
laws studied in this chapter, based on the way singularities and exponents evolve for
$u \approx 1$.


The foregoing discussion rightly suggests that a “minor” perturbation of a bivariate
generating function that affects neither the location nor the nature of the singularity
points to a discrete limit law. A “major” change, in location or in exponent, is
conducive to a continuous limit law, of which the prime example is the normal
distribution. Figure IX.4 outlines a typology of limit laws summarizing the spirit of this
chapter: a bivariate generating function $F(z,u)$ is to be analysed; the deformation
induced by $u$ affects the type of singularity of $F(z,u)$ in various ways, and an adapted
complex coefficient extraction provides corresponding limit laws.


IX. 2. Discrete limit laws 


This section provides the basic analytic–probabilistic technology needed for the
discrete-to-discrete situation, where the distribution of a (discrete) combinatorial
parameter tends (without normalization) to a discrete limit. The corresponding notion
of convergence is examined in Subsection IX. 2.1. Probability generating functions
(PGFs) are important since, by virtue of a continuity theorem stated in Subsection
IX. 2.2, convergence in distribution is implied by convergence of PGFs. At the
same time, the fact that PGFs of two distributions are close implies that the original
distribution functions are close. Finally, tail estimates for a distribution can be
easily related to analytic continuation of the PGFs, a basic property discussed in
Subsection IX. 2.3. This section organizes some general tools and accordingly we limit
ourselves to a single combinatorial application, that of the number of cycles of some
fixed size in a random permutation. The next section will provide a number of
applications to random combinatorial structures.

This and the next section feature three classical discrete laws described in Appendix
C.4: Special distributions, p. 774. For our reader’s convenience, their definitions
are recalled in Figure IX.5.
Distribution                      Probabilities                               PGF
geometric ($q$)                   $(1-q)\,q^{k}$                              $\dfrac{1-q}{1-qu}$
negative binomial[$m$] ($q$)      $\dbinom{m+k-1}{k}\, q^{k} (1-q)^{m}$       $\left(\dfrac{1-q}{1-qu}\right)^{m}$
Poisson ($\lambda$)               $e^{-\lambda}\,\dfrac{\lambda^{k}}{k!}$     $e^{-\lambda(1-u)}$


Figure IX.5. The three major discrete laws of analytic combinatorics: the geomet- 
ric, negative binomial, and Poisson laws. 


IX. 2.1. Convergence to a discrete law. In order to specify precisely what a
limit law is, we base ourselves on the general context described in Appendix C.5:
Convergence in law, p. 776. The principles presented there provide for what must be
the “right” notion of convergence of a family of discrete distributions to a limit discrete
distribution. Here is a self-standing definition, particularized to the cases of interest
here.

Definition IX.1 (Discrete-to-discrete convergence). The discrete random variables
$X_n$ supported by $\mathbb{Z}_{\ge 0}$ are said to converge in law, or converge in distribution, to a
discrete variable $Y$ supported by $\mathbb{Z}_{\ge 0}$, a property written as $X_n \Rightarrow Y$, if, for each
$k \ge 0$, one has

\[ \lim_{n\to\infty} \mathbb{P}(X_n \le k) = \mathbb{P}(Y \le k). \tag{5} \]

Convergence is said to take place at speed $\epsilon_n$ if

\[ \sup_{k} \left| \mathbb{P}(X_n \le k) - \mathbb{P}(Y \le k) \right| \le \epsilon_n. \tag{6} \]

The condition in (5) can be expressed in terms of the distribution functions
$F_n(k) = \mathbb{P}(X_n \le k)$ and $G(k) = \mathbb{P}(Y \le k)$ as

\[ \lim_{n\to\infty} F_n(k) = G(k), \]

pointwise for each $k$, in which case it is written as $F_n \Rightarrow G$ and is known as weak
convergence. One also says that the $X_n$ (or the $F_n$) admit a limit law of type $Y$ (or $G$).
In addition to limit laws in the sense of (5), there is also interest in examining the
convergence of individual probability values. One says that there exists a local limit
law if

\[ \lim_{n\to\infty} \mathbb{P}(X_n = k) = \mathbb{P}(Y = k), \tag{7} \]

for each $k \ge 0$, and $\delta_n$ is called a local speed of convergence if

\[ \sup_{k} \left| \mathbb{P}(X_n = k) - \mathbb{P}(Y = k) \right| \le \delta_n. \]

By differencing or summing, it is easily seen that the conditions (5) and (7) imply one
another. In other words: For the convergence of discrete random variables (RVs) to
a discrete RV, there is complete equivalence between the existence of a limit law in
the sense of (5) and of a local limit law (7). Note IX.2 below shows elementarily that
there always exists a speed of convergence that tends to 0 as $n$ tends to infinity. In
other words, plain convergence of distribution functions or of individual probabilities
implies uniform convergence.

In the following, the random variables $X_n$ are meant to represent a combinatorial
parameter $\chi$ taken over some class $\mathcal{F}$ and restricted to $\mathcal{F}_n$, that is,

\[ \mathbb{P}(X_n = k) := \mathbb{P}_{\mathcal{F}_n}(\chi = k). \]

The limit variable $Y$, i.e., its probability distribution $G$, is to be determined in each
particular case. A highly plausible indication of the occurrence of a discrete law is
the fact that the mean $\mu_n$ and variance $\sigma_n^{2}$ of $X_n$ remain bounded, i.e., they satisfy
$\mu_n = O(1)$ and $\sigma_n^{2} = O(1)$. Examination of initial entries in the table of values of
the probabilities will then normally permit one to detect whether a limit law holds.


Example IX.3. Singleton cycles in permutations. The case of the number of singleton cycles
(cycles of length 1) in a random permutation of size $n$ illustrates the basic notions, while it can
be studied with minimal analytic apparatus. The exponential BGF is

\[ \mathcal{P} = \operatorname{SET}(u\mathcal{Z} + \operatorname{CYC}_{\ge 2}(\mathcal{Z})) \quad\Longrightarrow\quad P(z,u) = \frac{\exp(z(u-1))}{1-z}, \tag{8} \]

which determines the mean $\mu_n = 1$ (for $n \ge 1$) and the standard deviation $\sigma_n = 1$ (for $n \ge 2$).
The table of numerical values of the probabilities $p_{n,k} := [z^{n} u^{k}] P(z,u)$ immediately tells what
goes on:

           k=0     k=1     k=2     k=3     k=4     k=5
n = 4      0.375   0.333   0.250   0.000   0.041
n = 5      0.366   0.375   0.166   0.083   0.000   0.008
n = 10     0.367   0.367   0.183   0.061   0.015   0.003
n = 20     0.367   0.367   0.183   0.061   0.015   0.003

The exact distribution is easily extracted from the bivariate GF,

\[ p_{n,k} := [z^{n} u^{k}]\, P(z,u) = \frac{1}{k!}\,[z^{n-k}]\, \frac{e^{-z}}{1-z} = \frac{d_{n-k}}{k!}, \tag{9} \]

where $n!\, d_n$ is the number of derangements of size $n$, that is,

\[ d_n = [z^{n}]\, \frac{e^{-z}}{1-z} = \sum_{j=0}^{n} \frac{(-1)^{j}}{j!}. \]

Asymptotically, one has $d_n \sim e^{-1}$. Thus, for fixed $k$, we have a local form of a limit law:

\[ \lim_{n\to\infty} p_{n,k} = p_k, \qquad p_k = \frac{e^{-1}}{k!}. \]

As a consequence: the distribution of the number of singleton cycles in a random permutation
of large size tends to a Poisson law of rate $\lambda = 1$.
Convergence is quite fast. Here is a table of differences, $\delta_{n,k} = p_{n,k} - e^{-1}/k!$:

           k=0                  k=1                   k=2                  k=3                   k=4                  k=5
n = 10     $2.3\cdot10^{-8}$    $-2.5\cdot10^{-7}$    $1.2\cdot10^{-6}$    $-3.7\cdot10^{-6}$    $7.3\cdot10^{-6}$    $-1.0\cdot10^{-5}$
n = 20     $1.8\cdot10^{-20}$   $-3.9\cdot10^{-19}$   $3.9\cdot10^{-18}$   $-2.4\cdot10^{-17}$   $1.1\cdot10^{-16}$   $-3.7\cdot10^{-16}$
The speed of convergence is easily bounded. Indeed, one has $d_n = e^{-1} + O(1/n!)$ by the
alternating series property, so that, uniformly,

\[ p_{n,k} = \frac{e^{-1}}{k!} + O\!\left(\frac{1}{k!\,(n-k)!}\right) = \frac{e^{-1}}{k!} + O\!\left(\frac{1}{n!}\binom{n}{k}\right) = \frac{e^{-1}}{k!} + O\!\left(\frac{2^{n}}{n!}\right). \]

As a consequence, one obtains local ($\delta_n$) and central ($\epsilon_n$) speed estimates

\[ \delta_n = O\!\left(\frac{2^{n}}{n!}\right), \qquad \epsilon_n = O\!\left(\frac{n\,2^{n}}{n!}\right). \]

These bounds are quite tight. For instance one computes that the best speed is $\delta_{50} = 1.5\cdot10^{-52}$,
while the quantity $2^{n}/n!$ evaluates to $3.7\cdot10^{-50}$. □
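
The exact formula $p_{n,k} = d_{n-k}/k!$ of Example IX.3 lends itself to a direct numerical check. The sketch below is an illustration added here, not the authors' code; the helper names and the rational stand-in `INV_E` for $e^{-1}$ (a truncated series, used only so that differences far below floating-point precision remain visible) are assumptions.

```python
from fractions import Fraction
from math import factorial


def derangement_ratio(n):
    """d_n = sum_{j=0}^{n} (-1)^j / j!, so that n! * d_n counts derangements."""
    return sum(Fraction((-1) ** j, factorial(j)) for j in range(n + 1))


def singleton_cycle_probability(n, k):
    """p_{n,k} = d_{n-k} / k!: probability of exactly k fixed points in size n."""
    if k > n:
        return Fraction(0)
    return derangement_ratio(n - k) / factorial(k)


# Rational stand-in for e^{-1}; the truncation error is below 1/81!.
INV_E = derangement_ratio(80)


def difference(n, k):
    """delta_{n,k} = p_{n,k} - e^{-1}/k!, with e^{-1} replaced by INV_E."""
    return singleton_cycle_probability(n, k) - INV_E / factorial(k)


if __name__ == "__main__":
    for n in (10, 20):
        print(n, ["%.2e" % float(difference(n, k)) for k in range(6)])
```

Exact rationals are used throughout because, for $n = 20$, the differences are smaller than the rounding error of double-precision arithmetic.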


▷ IX.2. Uniform convergence. Local and global convergences to a discrete limit law are always
uniform. In other words, there always exist speeds $\epsilon_n$, $\delta_n$ tending to 0 as $n \to \infty$.

Proof. Set $p_{n,k} := \mathbb{P}(X_n = k)$ and $q_k := \mathbb{P}(Y = k)$. Assume simply the condition (5) and its
equivalent form (7). Fix a small $\epsilon > 0$. First dispose of the tails: there exists a $k_0$ such that
$\sum_{k \ge k_0} q_k \le \epsilon$, so that $\sum_{k < k_0} q_k > 1 - \epsilon$. Now, by simple convergence, for all large enough
$n \ge n_0$, there holds $|p_{n,k} - q_k| < \epsilon/k_0$, for each $k < k_0$. Thus, we have $\sum_{k < k_0} p_{n,k} > 1 - 2\epsilon$,
hence $\sum_{k \ge k_0} p_{n,k} \le 2\epsilon$. At this stage, we have proved that $\sum_{k \ge k_0} q_k$ and $\sum_{k \ge k_0} p_{n,k}$ are
both in $[0, 2\epsilon]$. This shows that convergence of distribution functions is uniform, with speed
$\epsilon_n \le 3\epsilon$. Furthermore, a local speed exists, which satisfies $\delta_n \le 2\epsilon$. ◁


▷ IX.3. Speed in local and global estimates. Let $M_n$ be the spread of $\chi$ on $\mathcal{F}_n$ defined as
$M_n := \max_{\gamma \in \mathcal{F}_n} \chi(\gamma)$. Then, a speed of convergence in (6) is given by

\[ \epsilon_n := M_n \delta_n + \sum_{k > M_n} q_k. \]

(Refinements of these inequalities can be obtained from tail estimates detailed on p. 627.) ◁


▷ IX.4. Total variation distance. The total variation distance between $X$ and $Y$ is classically

\[ d_{TV}(X,Y) := \sup_{E \subseteq \mathbb{Z}_{\ge 0}} \left| \mathbb{P}_Y(E) - \mathbb{P}_X(E) \right| = \frac{1}{2} \sum_{k \ge 0} \left| \mathbb{P}(Y = k) - \mathbb{P}(X = k) \right|. \]

(Equivalence between the two forms is established elementarily by considering the particular
$E$ for which the supremum is attained.) The argument of Note IX.2 shows that convergence
in distribution also implies that the total variation distance between $X_n$ and $X$ tends to 0. In
addition, by Note IX.3, one has $d_{TV}(X_n, X) \le M_n \delta_n + \sum_{k > M_n} p_k$, with $p_k := \mathbb{P}(X = k)$. ◁
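
The equivalence of the two forms of the total variation distance in Note IX.4 can be verified by exhaustive search on small supports. This is only an illustrative sketch (the function names are made up for the occasion); the brute-force supremum ranges over all $2^{m}$ events and is therefore usable only for very small $m$.

```python
from fractions import Fraction
from itertools import chain, combinations


def tv_half_l1(p, q):
    """Total variation distance as half the L1 distance between two finite laws."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2


def tv_sup_over_events(p, q):
    """Same distance as a supremum over all events E (brute force, small supports only)."""
    support = range(len(p))
    events = chain.from_iterable(combinations(support, r) for r in range(len(p) + 1))
    return max(abs(sum(p[k] - q[k] for k in event)) for event in events)


if __name__ == "__main__":
    p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4), Fraction(0)]
    q = [Fraction(1, 4)] * 4
    print(tv_half_l1(p, q), tv_sup_over_events(p, q))
```

The supremum is attained on the event $E = \{k : p_k > q_k\}$ (or its complement), which is the observation behind the parenthetical remark of the note.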


▷ IX.5. Escape to infinity. The sequence $X_n$, where

\[ \mathbb{P}\{X_n = 0\} = 1/3, \qquad \mathbb{P}\{X_n = 1\} = 1/3, \qquad \mathbb{P}\{X_n = n\} = 1/3, \]

does not satisfy a discrete limit law in the sense above, although $\lim_{n\to\infty} \mathbb{P}\{X_n = k\}$ exists for
each $k$. Some of the probability mass escapes to infinity; in a way, convergence takes place in
$\mathbb{Z} \cup \{+\infty\}$. ◁


IX. 2.2. Continuity theorem for PGFs. A high-level approach to discrete limit
laws in analytic combinatorics is based on asymptotic estimates of the PGF $p_n(u)$ of
a random variable $X_n$ arising from a parameter $\chi$ over a class $\mathcal{C}_n$. If, for sufficiently
many values of $u$, one has

\[ p_n(u) \to q(u) \qquad (n \to +\infty), \]

one can infer that the coefficients $p_{n,k} = [u^{k}] p_n(u)$ (for any fixed $k$) tend to the limit
$q_k = [u^{k}] q(u)$. A general continuity theorem for PGFs describes precisely the conditions
under which convergence of PGFs to a limit entails convergence of coefficients
to a limit, that is to say, the occurrence of a discrete limit law.


Theorem IX.1 (Continuity Theorem, discrete laws). Let $\Omega$ be an arbitrary set contained
in the unit disc and having at least one accumulation point in the interior of
the disc. Assume that the probability generating functions $p_n(u) = \sum_{k \ge 0} p_{n,k} u^{k}$ and
$q(u) = \sum_{k \ge 0} q_k u^{k}$ are such that there is convergence,

\[ \lim_{n\to+\infty} p_n(u) = q(u), \]

pointwise for each $u$ in $\Omega$. Then a discrete limit law holds in the sense that, for each $k$,

\[ \lim_{n\to+\infty} p_{n,k} = q_k \qquad\text{and}\qquad \lim_{n\to+\infty} \sum_{j \le k} p_{n,j} = \sum_{j \le k} q_j. \]

Proof. The $p_n(u)$ are a priori analytic in $|u| < 1$ and uniformly bounded by 1 in
modulus throughout $|u| \le 1$. Vitali’s Theorem, a classical result of analysis (see [577,
p. 168] or [329, p. 566]), is as follows:

    Vitali’s theorem. Let $\mathcal{F}$ be a family of analytic functions defined in a region
    $S$ (an open connected set) and uniformly bounded on every compact
    subset of $S$. Let $\{f_n\}$ be a sequence of functions of $\mathcal{F}$ that converges on a set
    $\Omega \subset S$ having a point of accumulation $q \in S$. Then $\{f_n\}$ converges in all
    of $S$, uniformly on every compact subset $T \subset S$.

Here, we take $S$ to be the open unit disc on which all the $p_n(u)$ are bounded
(since $p_n(1) = 1$). The sequence in question is $\{p_n(u)\}$. By assumption, there is
convergence of $p_n(u)$ to $q(u)$ on $\Omega$. Vitali’s theorem implies that this convergence
is uniform in any compact subdisc of the unit disc, for instance, $|u| \le 1/2$. Then,
Cauchy’s coefficient formula provides

\[ q_k = \frac{1}{2i\pi} \int_{|u|=1/2} q(u)\, \frac{du}{u^{k+1}}
       = \lim_{n\to\infty} \frac{1}{2i\pi} \int_{|u|=1/2} p_n(u)\, \frac{du}{u^{k+1}}
       = \lim_{n\to\infty} p_{n,k}, \tag{10} \]

where uniformity granted by Vitali’s theorem is combined with continuity of the contour
integral (with respect to the integrand). ∎
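
The coefficient extraction (10) on the circle $|u| = 1/2$ can be imitated numerically. The sketch below is a hedged illustration, not the book's procedure: it replaces the contour integral by the trapezoidal rule on equally spaced points (an assumption made for simplicity; the number of points is arbitrary) and applies it both to the Poisson(1) PGF and to the PGF of singleton cycles from Example IX.3.

```python
import cmath
import math


def coefficient_on_circle(pgf, k, radius=0.5, points=64):
    """Approximate [u^k] pgf(u) by the trapezoidal rule on the circle |u| = radius."""
    total = 0
    for m in range(points):
        u = radius * cmath.exp(2j * math.pi * m / points)
        total += pgf(u) / u ** k  # du / u^(k+1) becomes i * dtheta / u^k
    return (total / points).real


def poisson_pgf(u, rate=1.0):
    """PGF of the Poisson law, exp(rate * (u - 1))."""
    return cmath.exp(rate * (u - 1))


def singleton_cycle_pgf(n):
    """PGF u -> sum_k p_{n,k} u^k of the number of fixed points in size n."""
    d = [sum((-1) ** i / math.factorial(i) for i in range(m + 1)) for m in range(n + 1)]
    return lambda u: sum(d[n - k] / math.factorial(k) * u ** k for k in range(n + 1))


if __name__ == "__main__":
    for k in range(4):
        print(k, coefficient_on_circle(singleton_cycle_pgf(12), k),
              coefficient_on_circle(poisson_pgf, k))
```

Closeness of the two PGFs on the small circle translates into closeness of each coefficient, which is the mechanism used in the proof.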


Feller gives the sufficient set of conditions $p_n(u) \to q(u)$ pointwise for all real
$u \in (0,1)$, which in our terminology corresponds to the special case $\Omega = (0,1)$;
see [205, p. 280] for a proof that only involves elementary real analysis. It is perhaps
surprising that very different sets $\Omega$ can be taken, for instance,

\[ \Omega = \left[-\tfrac{1}{3}, -\tfrac{1}{2}\right], \qquad \Omega = \left\{\tfrac{1}{n}\right\}, \qquad \Omega = \left\{\tfrac{\sqrt{-1}}{7} + \tfrac{1}{2^{n}}\right\}. \]

The next statement relates a measure of distance between two PGFs, $p(u)$ and
$q(u)$, to the distance between distributions. It is naturally of interest when quantifying
speed of convergence to the limit in the discrete-to-discrete case.
Theorem IX.2 (Speed of convergence, discrete laws). Consider two random variables
supported by $\mathbb{Z}_{\ge 0}$, with distribution functions $F(x)$, $G(x)$ and probability generating
functions $p(u)$, $q(u)$.

(i) Assume the existence of first moments. Then, for any $T \in (0, \pi)$, one has

\[ \sup_{k} |F(k) - G(k)| \le \frac{1}{4} \int_{-T}^{+T} \left| \frac{p(e^{it}) - q(e^{it})}{t} \right| dt
   + \frac{1}{2\pi T} \sup_{T \le |t| \le \pi} \left| p(e^{it}) - q(e^{it}) \right|. \tag{11} \]

(ii) Assume that $p(u)$ and $q(u)$ are analytic in $|u| < \rho$, for some $\rho > 1$. Then,
for any $r$ satisfying $1 < r < \rho$, one has

\[ \sup_{k} |F(k) - G(k)| \le \frac{1}{r-1} \sup_{|u|=r} |p(u) - q(u)|. \tag{12} \]

Proof. (i) Observe first that $p(1) = q(1) = 1$, so that the integrand is of the form $\frac{0}{0}$ at
$t = 0$, corresponding to $u = e^{it} = 1$. By Appendix C.3: Transforms of distributions,
p. 772, the existence of first moments, say $\mu$ for $F$ and $\nu$ for $G$, implies that, for
small $t$, one has $p(e^{it}) - q(e^{it}) = (\mu - \nu)t + o(t)$, so that the integral is indeed well
defined.

For any given $k$, Cauchy’s coefficient formula provides

\[ F(k) - G(k) = \frac{1}{2i\pi} \int_{\gamma} \frac{p(u) - q(u)}{1-u}\, \frac{du}{u^{k+1}}
   = \frac{1}{2\pi} \int_{-\pi}^{+\pi} \frac{p(e^{it}) - q(e^{it})}{1 - e^{it}}\, e^{-ikt}\, dt, \tag{13} \]

where $\gamma$ is taken to be the circle $|u| = 1$, and the trigonometric form results from
setting $u = e^{it}$. (The factor $(1-u)^{-1}$ sums coefficients.) In the trigonometric integral,
split the interval of integration according as $|t| \le T$ and $|t| \ge T$. For $t \in [-\pi, \pi]$,
one has elementarily

\[ \left| \frac{t}{e^{it} - 1} \right| \le \frac{\pi}{2}. \]

For $|t| \le T$, this inequality makes it possible to replace $|1 - u|^{-1}$ by $1/|t|$, up to a
constant multiplier and get as a majorant the first term on the right of (11). For $|t| \ge T$,
trivial upper bounds provide the second term on the right of (11).

(ii) Start from the contour integral in (13), but now integrate along $|u| = r$.
Trivial bounds provide (12). ∎
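
As a sanity check of Part (ii), the two sides of (12) can be compared numerically for the singleton-cycle law against the Poisson(1) law. The code is a sketch under stated assumptions: the supremum over $|u| = r$ is only sampled on a finite grid, the scan over $k$ is cut at an arbitrary `cutoff`, and all names are illustrative.

```python
import cmath
import math


def singleton_cycle_law(n):
    """Probabilities p_{n,k}, k = 0..n, of k fixed points in a random permutation."""
    d = [sum((-1) ** i / math.factorial(i) for i in range(m + 1)) for m in range(n + 1)]
    return [d[n - k] / math.factorial(k) for k in range(n + 1)]


def kolmogorov_distance_to_poisson(n, cutoff=60):
    """sup_k |F_n(k) - G(k)| against Poisson(1), scanning k = 0..cutoff (needs n <= cutoff)."""
    p = singleton_cycle_law(n) + [0.0] * (cutoff - n)
    q = [math.exp(-1) / math.factorial(k) for k in range(cutoff + 1)]
    worst, gap = 0.0, 0.0
    for pk, qk in zip(p, q):
        gap += pk - qk  # running value of F_n(k) - G(k)
        worst = max(worst, abs(gap))
    return worst


def bound_12(n, r=2.0, points=720):
    """Right-hand side of (12), with the supremum over |u| = r sampled on a grid."""
    p = singleton_cycle_law(n)
    sup = 0.0
    for m in range(points):
        u = r * cmath.exp(2j * math.pi * m / points)
        pn = sum(pk * u ** k for k, pk in enumerate(p))
        sup = max(sup, abs(pn - cmath.exp(u - 1)))
    return sup / (r - 1)


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, kolmogorov_distance_to_poisson(n), bound_12(n))
```

Both PGFs are entire here, so any $r > 1$ is admissible; the choice $r = 2$ is arbitrary.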


The first form holds with strictly minimal assumptions (existence of expectations);
the second form is a priori only usable for distributions that have exponential
tails, as discussed in Subsection IX. 2.3 below. The first form relates the distance on
the unit circle between the PGF $p_n(u)$ of a combinatorial parameter and the limit PGF
$q(u)$ to the speed of convergence to the limit law; it prefigures the Berry–Esseen
inequalities discussed in the continuous context on p. 641.
Example IX.4. Cycles of length m in permutations. Let us first revisit the number $\chi$ of
singleton cycles ($m = 1$) in this new light. The BGF $P(z,u) = e^{z(u-1)}/(1-z)$, given by
Equation (8) in Example IX.3, has for each $u$ a simple pole at $z = 1$ and is otherwise analytic
in $\mathbb{C} \setminus \{1\}$. Thus, a meromorphic analysis provides instantly, pointwise for any fixed $u$,

\[ [z^{n}]\, P(z,u) = e^{u-1} + O(R^{-n}), \]

with any $R > 1$. This, by the continuity theorem, Theorem IX.1, implies convergence to a limit
law, which is Poisson.

Figure IX.6. The PGFs of singleton cycles in random permutations of size $n =
4, 8, 12$ (left to right and top to bottom) illustrate convergence to the limit PGF of the
Poisson(1) distribution (bottom right). The modulus of each PGF is displayed, for
$|\Re(u)|, |\Im(u)| \le 3$.

Next, in order to obtain a speed of convergence, one should estimate a distance between
PGFs over the unit circle. One has, for $p_n(u)$ and $q(u)$, respectively, the PGF of $\chi$ over $\mathcal{P}_n$ and
the PGF of a Poisson variable of parameter 1:

\[ p_n(u) - q(u) = [z^{n}]\, \frac{e^{z(u-1)} - e^{u-1}}{1-z}. \]

There is a removable singularity at $z = 1$. Thus, integration over the circle $|z| = 2$ in the $z$-plane
is permissible, and

\[ p_n(u) - q(u) = \frac{1}{2i\pi} \int_{|z|=2} \frac{e^{z(u-1)} - e^{u-1}}{1-z}\, \frac{dz}{z^{n+1}}. \]

Trivial bounds applied to the last integral then yield

\[ |p_n(u) - q(u)| \le 2^{-n} \sup_{|z|=2} \left| e^{z(u-1)} - e^{u-1} \right| = O\!\left(2^{-n}\, |1-u|\right), \]


uniformly for $u$ in any compact set of $\mathbb{C}$. One can then apply Theorem IX.2, Part (i). The
value $T = \pi/2$ is suitable, to the effect that a speed of convergence to the limit is found to be
$O(2^{-n})$. (Any $O(R^{-n})$ is furthermore possible by a similar argument.) Numerical aspects of
the convergence are illustrated in Figure IX.6.




This approach generalizes straightforwardly to the number of $m$-cycles in a random
permutation ($m$ kept fixed). The exponential BGF is

\[ F(z,u) = \frac{e^{(u-1)z^{m}/m}}{1-z}. \]

Then, singularity analysis of the meromorphic function of $z$ (for $u$ fixed) gives immediately

\[ \lim_{n\to\infty} [z^{n}]\, F(z,u) = e^{(u-1)/m}. \]

The right-hand side of this equality is none other than the PGF of a Poisson law of rate $\lambda = 1/m$.
The continuity theorem and the first form of the speed of convergence theorem then imply:
The number of $m$-cycles in a random permutation of large size converges in law to a Poisson
distribution of rate $1/m$ with speed of convergence $O(R^{-n})$ for any $R > 1$. This last result
appreciably generalizes our previous observations on singleton cycles. □
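
For small sizes, the law of the number of $m$-cycles can be obtained by exhaustive enumeration and compared with the coefficients of the BGF. The sketch below is illustrative only: the closed form used in `m_cycle_law_from_bgf` was obtained here by expanding $e^{uz^{m}/m} \cdot e^{-z^{m}/m}/(1-z)$ term by term and is an assumption of this example rather than a formula quoted from the text.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def count_m_cycles(perm, m):
    """Number of cycles of length exactly m in a permutation given as a tuple."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        if length == m:
            count += 1
    return count


def m_cycle_law_bruteforce(n, m):
    """Exact law of the number of m-cycles, by listing all n! permutations."""
    counts = [0] * (n // m + 1)
    for perm in permutations(range(n)):
        counts[count_m_cycles(perm, m)] += 1
    return [Fraction(c, factorial(n)) for c in counts]


def m_cycle_law_from_bgf(n, m):
    """[z^n u^k] exp((u-1) z^m / m) / (1 - z), expanded by hand for this sketch."""
    law = []
    for k in range(n // m + 1):
        tail = sum(Fraction((-1) ** j, m ** j * factorial(j))
                   for j in range((n - m * k) // m + 1))
        law.append(tail / (m ** k * factorial(k)))
    return law


if __name__ == "__main__":
    print(m_cycle_law_bruteforce(6, 2))
    print(m_cycle_law_from_bgf(6, 2))
```

As $n$ grows with $k$ fixed, the inner sum tends to $e^{-1/m}$, so that the $k$th probability tends to $e^{-1/m}/(m^{k} k!)$, the Poisson value of rate $1/m$.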


▷ IX.6. A quiz. Figure IX.6 tacitly assumes that the property $|p_n(u)| \to |p(u)|$ suffices to
conclude that $p_n(u) \to p(u)$. Can you justify it? [Hint: for an analytic function $\phi$, if we know
$|\phi(u)|$, we know $\log|\phi(u)| = \Re(\log \phi(u))$. But then we can reconstruct $\Im(\log \phi(u))$ by the
Cauchy–Riemann equations (p. 742). Hence, we know $\log \phi(u)$, hence $\phi(u)$ itself.] ◁


▷ IX.7. Poisson law for rare events. Consider the binomial distribution with PGF $(q + pu)^{n}$.
If $p$ depends on $n$ in such a way that $p = \lambda/n$ for some fixed $\lambda$, then the limit law of the
binomial random variable is Poisson of rate $\lambda$. (This “law of small numbers” explains the
Poisson character of activity in radioactive decay as well as the occurrence of accidental deaths
of soldiers in the Prussian army resulting from the kick of a horse [Bortkiewicz, 1898].) ◁
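
The law of small numbers of Note IX.7 is easy to observe numerically. The following sketch (illustrative names; floating-point arithmetic; the scan is limited to $k \le$ `kmax`, an arbitrary cutoff) measures the largest pointwise gap between the Binomial($n$, $\lambda/n$) and Poisson($\lambda$) probabilities.

```python
import math


def binomial_pmf(n, p, k):
    """P(X = k) for the binomial law with PGF (q + p u)^n, q = 1 - p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


def poisson_pmf(rate, k):
    """P(Y = k) for the Poisson law of the given rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def max_gap(n, rate, kmax=30):
    """Largest pointwise gap between Binomial(n, rate/n) and Poisson(rate) for k <= kmax."""
    p = rate / n
    return max(abs(binomial_pmf(n, p, k) - poisson_pmf(rate, k))
               for k in range(min(n, kmax) + 1))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, max_gap(n, 2.0))
```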


IX. 2.3. Tail estimates. Tail estimates quantify the rate of decrease of probabilities
away from the central part of the distribution. In the case of a discrete limit law
having a finite mean, what one needs is information regarding $\mathbb{P}(X > k)$ as $k$ gets
large. A simple, but often effective, approach consists in appealing to saddle-point
bounds. We give here a general statement which is nothing but a rephrasing of such
bounds adapted to discrete probability distributions.

Theorem IX.3 (Tail bounds, discrete laws). Let $p(u) = \mathbb{E}(u^{X})$ be a probability
generating function that is analytic for $|u| \le r$ where $r$ is some number satisfying
$r > 1$. Then, the following “local” and “global” tail bounds hold:

\[ \mathbb{P}(X = k) \le \frac{p(r)}{r^{k}}, \qquad \mathbb{P}(X > k) \le \frac{p(r)}{r^{k}(r-1)}. \]

Proof. The local estimate is a direct consequence of trivial bounds applied to Cauchy’s
integrals, namely

\[ \mathbb{P}(X = k) = \frac{1}{2i\pi} \int_{|u|=r} p(u)\, \frac{du}{u^{k+1}} \le \frac{p(r)}{r^{k}}. \]

The cumulative bound is derived from the useful integral representation

\[ \mathbb{P}(X > k) = \frac{1}{2i\pi} \int_{|u|=r} p(u) \left( \frac{1}{u^{k+2}} + \frac{1}{u^{k+3}} + \cdots \right) du
   = \frac{1}{2i\pi} \int_{|u|=r} p(u)\, \frac{du}{u^{k+1}(u-1)}, \]

upon applying again trivial bounds. (Alternatively, summation from the local bounds
can be used.) ∎
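
The bounds of Theorem IX.3 can be tried out on the Poisson(1) law, whose PGF $e^{u-1}$ is entire, so that any $r > 1$ is admissible. The sketch below is only an illustration; the choice $r = k$ is an assumption motivated by the approximate saddle point of $e^{r-1}/r^{k}$, and the helper names are made up.

```python
import math


def local_tail_bound(pgf, r, k):
    """Theorem IX.3, local form: P(X = k) <= p(r) / r^k."""
    return pgf(r) / r ** k


def global_tail_bound(pgf, r, k):
    """Theorem IX.3, global form: P(X > k) <= p(r) / (r^k (r - 1))."""
    return pgf(r) / (r ** k * (r - 1))


def poisson_pgf(u, rate=1.0):
    """PGF of the Poisson law, exp(rate * (u - 1))."""
    return math.exp(rate * (u - 1))


def poisson_tail(k, rate=1.0, terms=80):
    """P(X > k) for a Poisson variable, by direct summation of `terms` terms."""
    return sum(math.exp(-rate) * rate ** j / math.factorial(j)
               for j in range(k + 1, k + 1 + terms))


if __name__ == "__main__":
    for k in (2, 5, 10, 20):
        r = float(k)  # near-optimal radius for the Poisson(1) PGF
        print(k, poisson_tail(k), global_tail_bound(poisson_pgf, r, k))
```

With $r$ kept fixed the bound decays geometrically in $k$; letting $r$ grow with $k$, as done here, exposes the faster decay available when the PGF is entire.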
The bounds provided always exhibit a geometric decay in the value of $k$; this is
both a strength and a limitation of the method. In accordance with the theorem and as
is easily checked directly, the geometric and the negative binomial distributions have
exponential tails; the Poisson law even has a “superexponential” tail, being $O(R^{-k})$
for any $R > 1$, since its PGF is entire. By their nature, the bounds can also be
simultaneously applied to a whole family of probability generating functions, as shown
by the characteristic example below. Hence their use in obtaining uniform estimates
in the context of limit laws, in a way that prefigures the study of large deviations
in Section IX. 10.


Example IX.5. Permutations with a large number of singleton cycles. The problem here is to
quantify the probability that a permutation of size $n$ has more than $k = \log n$ singleton cycles,
a quantity that is far from the mean value 1. The elementary treatment of Example IX.3 is
certainly applicable but it has the disadvantage of not easily generalizing to other situations. In
the perspective of applying Theorem IX.3, we seek instead to bound $p_n(u)$ for $u > 0$, where
$p_n(u) = [z^{n}]\, e^{z(u-1)}/(1-z)$, by Equation (8). We have, for $u > 0$ and any $s \in (0,1)$,

\[ p_n(u) = [z^{n}]\, \frac{e^{z(u-1)}}{1-z} \le \frac{e^{s(u-1)}}{1-s}\, s^{-n}, \]

as found from saddle-point bounds (in the $z$-plane) applied to the BGF $P(z,u)$. Taking $s =
1 - 1/n$, which is suggested by the usual scaling of singularity analysis as well as by the
saddle-point principles, gives the following bound on the PGF,

\[ p_n(u) \le 2n\, e^{u}, \]

valid for all $n \ge 2$. (Better estimates are available from the precise analysis of Example IX.4,
but the improvement regarding tail bounds would be marginal.) Choosing now the value $r = \log n$
in the statement of Theorem IX.3 provides an approximate saddle-point bound, and we get for
$n \ge 10$ (say)

\[ \sum_{j > \log n} p_{n,j} \le \frac{2n^{2}}{(\log n)^{\log n}}. \]

Thus the probability of observing more than $\log n$ singleton cycles is asymptotically smaller
than any inverse power of $n$. Note that, in this example, we have made use of Theorem IX.3,
while opting to estimate the PGFs plainly by saddle-point bounds taken with respect to the
principal variable $z$ of the corresponding bivariate generating function. □
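
To get a feeling for the bound of Example IX.5, one may tabulate $2n^{2}/(\log n)^{\log n}$ against the tail probability computed from $p_{n,k} = d_{n-k}/k!$. The sketch below is a rough illustration in floating-point arithmetic; the truncation parameters `terms` and `extra` are arbitrary assumptions, chosen so that the neglected terms lie below machine precision.

```python
import math


def derangement_ratio(m, terms=30):
    """d_m = sum_{j<=m} (-1)^j / j!; terms beyond j = 30 are below float precision."""
    return sum((-1) ** j / math.factorial(j) for j in range(min(m, terms) + 1))


def exact_tail(n, extra=40):
    """P(more than log n singleton cycles), truncating the sum after `extra` terms."""
    k0 = math.floor(math.log(n)) + 1  # smallest integer k with k > log n
    return sum(derangement_ratio(n - k) / math.factorial(k)
               for k in range(k0, min(n, k0 + extra) + 1))


def tail_bound(n):
    """The bound 2 n^2 / (log n)^(log n) obtained with r = log n."""
    log_n = math.log(n)
    return 2 * n ** 2 / log_n ** log_n


if __name__ == "__main__":
    for n in (10 ** 3, 10 ** 6, 10 ** 9, 10 ** 12):
        print(n, exact_tail(n), tail_bound(n))
```

The bound is asymptotic in nature: it exceeds 1 for $n = 10^{3}$ and becomes informative only for much larger sizes, although it eventually beats any fixed inverse power of $n$.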


IX. 3. Combinatorial instances of discrete laws 


In this section, we focus our attention on the general analytic schema based on
compositions (p. 411), and more specifically on its subcritical case (Definition IX.2
below). It is such that the perturbations induced by the secondary variable ($u$) affect
neither the location nor the nature of the basic singularity involved in the univariate
counting problem. The limit laws are then of the discrete type. In particular, for
the labelled universe and for subcritical sequences, sets, and cycles, these limit laws
are invariably of the negative binomial, Poisson, and geometric type, respectively.
Additionally, it is easy to describe the profiles of combinatorial objects resulting from
such subcritical constructions.
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Subcritical compositions. First, we consider the general composition schema, 
F =Go(uH) = F(z, u) = g(uh(z)). 


This schema expresses over generating functions the combinatorial operation G oH of 
substitution of components 7 enumerated by h(z) inside “templates” G enumerated 
by g(z). (See Chapters I, p. 86 and I, p. 137, for the unlabelled and labelled versions, 
and Chapter III, p. 199, for the bivariate versions.) The variable z marks size as usual, 
and the variable u marks the size of the G—-template. 

We assume globally that $g$ and $h$ have non-negative coefficients and that $h(0) = 0$
so that the composition $g(h(z))$ is well-defined. We let $\rho_g$ and $\rho_h$ denote the radii of
convergence of $g$ and $h$, and define

$$\tau_g := \lim_{x \to \rho_g^-} g(x) \quad\text{and}\quad \tau_h := \lim_{x \to \rho_h^-} h(x). \tag{14}$$

The (possibly infinite) limits exist due to the non-negativity of coefficients. As already
discussed in Section VI.9, p. 411, three cases are to be distinguished.


Definition IX.2. The composition schema $F(z, u) = g(uh(z))$ is said to be subcritical
if $\tau_h < \rho_g$, critical if $\tau_h = \rho_g$, supercritical if $\tau_h > \rho_g$.

In terms of singularities, the behaviour of $g(h(z))$ at its dominant singularity is
dictated by the dominant singularity of $h$ (subcritical case), or by the dominant
singularity of $g$ (supercritical case), or else it involves a mixture of the two (critical case).


This section is concerned with the subcritical case⁶.


Proposition IX.1 (Subcritical composition, number of components). Consider the
bivariate composition schema $F(z, u) = g(uh(z))$. Assume that $g(z)$ and $h(z)$ satisfy
the subcriticality condition $\tau_h < \rho_g$, and that $h(z)$ has a unique singularity at $\rho = \rho_h$
on its disc of convergence, which, in a $\Delta$-domain, is of the type

$$h(z) = \tau - c\left(1 - \frac{z}{\rho}\right)^{\lambda} + o\!\left(\left(1 - \frac{z}{\rho}\right)^{\lambda}\right),$$

where $\tau = \tau_h$, $c \in \mathbb{R}^{+}$, $0 < \lambda < 1$. Then, a discrete limit law holds for the number of
$\mathcal{H}$-components: with $f_{n,k} := [z^n u^k] F(z, u)$, $f_n := [z^n] F(z, 1)$, and $g_k := [w^k] g(w)$, one has

$$\lim_{n \to \infty} \frac{f_{n,k}}{f_n} = q_k, \qquad\text{where}\qquad q_k = \frac{k g_k \tau^{k-1}}{g'(\tau)}.$$

The probability generating function of the limit distribution $(q_k)$ is

$$q(u) = \frac{u g'(\tau u)}{g'(\tau)}.$$

Proof. First, we examine the univariate counting problem. Since $g(z)$ is analytic at $\tau$,
the function $g(h(z))$ is singular at $\rho_h$ and is analytic in a $\Delta$-domain. Its singular
expansion is obtained by composing the regular expansion of $g(z)$ at $\tau$ with the singular
⁶ By contrast with the discrete laws encountered here, the case of a supercritical composition leads
to continuous limit laws of the Gaussian type (Section IX.6). The critical case involves a confluence of
singularities, which induces stable laws (Section IX.11).

expansion of $h(z)$ at $\rho_h$:

$$F(z) \equiv g(h(z)) = g(\tau) - c g'(\tau)\,(1 - z/\rho)^{\lambda}\,(1 + o(1)).$$

Thus, $F(z)$ satisfies the conditions of singularity analysis, and

$$f_n \equiv [z^n] F(z) = -\frac{c g'(\tau)}{\Gamma(-\lambda)}\, \rho^{-n} n^{-\lambda-1}\,(1 + o(1)). \tag{15}$$

By similar devices, the mean and variance of the distribution are found to be
each $O(1)$.

Next, for the bivariate problem, fix any $u$ with, say, $u \in (0, 1)$. The BGF
$F(z, u)$ is also seen to be singular at $z = \rho$, and its singular expansion, obtained
from $F(z, u) = g(uh(z))$ by composition, is

$$\begin{aligned} F(z, u) = g(uh(z)) &= g\!\left(u\tau - cu(1 - z/\rho)^{\lambda} + o((1 - z/\rho)^{\lambda})\right) \\ &= g(u\tau) - cu\, g'(u\tau)\,(1 - z/\rho)^{\lambda} + o((1 - z/\rho)^{\lambda}). \end{aligned} \tag{16}$$

Thus, singularity analysis implies immediately:

$$\lim_{n \to \infty} \frac{[z^n] F(z, u)}{[z^n] F(z, 1)} = \frac{u g'(u\tau)}{g'(\tau)}.$$

By the continuity theorem for PGFs, this is enough to imply convergence to the
discrete limit law with PGF $u g'(\tau u)/g'(\tau)$, and the proposition is established. ∎

What stands out in the statement of Proposition IX.1 is the following general fact: 
In a subcritical composition, the limit law is a direct reflection of the derivative of the 
outer function involved in the composition. 


▷ IX.8. Tail bounds for subcritical compositions. Under the subcritical composition schema,
it is also true that the tails have a uniformly geometric decay. Let $u_0$ be any number of the
interval $(1, \rho_g/\tau_h)$. Then the function $z \mapsto F(z, u_0)$ is analytic near the origin with a dominant
singularity at $\rho_h$, again obtained by composing the regular expansion of $g$ with the singular
expansion of $h$, and Equation (16) remains valid at $u = u_0$. There results the asymptotic
estimate

$$p_n(u_0) = \frac{[z^n] F(z, u_0)}{[z^n] F(z, 1)} \sim \frac{u_0\, g'(u_0 \tau_h)}{g'(\tau_h)}.$$

Thus, for some constant $K = K(u_0)$, one has $p_n(u_0) < K$. It is also easy to verify that $p_n(u)$
is analytic at $u_0$, so that, by Theorem IX.3,

$$p_{n,k} \le K(u_0) \cdot u_0^{-k}, \qquad \sum_{j > k} p_{n,j} \le \frac{K(u_0)}{u_0 - 1}\, u_0^{-k}.$$

Therefore, the combinatorial distributions satisfy, uniformly with respect to $n$, a tail bound. In
particular the probability that there are more than a logarithmic number of components satisfies

$$\mathbb{P}_n(\chi > \log n) = O(n^{-\theta}), \qquad\text{with}\qquad \theta = \log u_0. \tag{17}$$

Such tail estimates may additionally serve to evaluate the speed of convergence to the limit law
(as well as the total variation distance) in the subcritical composition schema. ◁


▷ IX.9. Semi-small powers and singularity analysis. Let $h(z)$ satisfy the stronger singular
expansion

$$h(z) = \tau - c(1 - z/\rho)^{\lambda} + O\!\left((1 - z/\rho)^{\nu}\right),$$

for $0 < \lambda < \nu < 1$. Then, for $k \le C \log n$ (some $C > 0$), the results of singularity analysis can
be extended (under the form proved in Chapter VI, they are only valid for fixed $k$):

$$[z^n]\, h(z)^k = -\frac{k c\, \tau^{k-1}}{\Gamma(-\lambda)}\, \rho^{-n} n^{-\lambda-1} \left(1 + O(n^{-\theta_1})\right),$$

for some $\theta_1 > 0$, uniformly with respect to $k$. [The proof recycles the Hankel contour of
Chapter VI, with some care needed in checking uniformity with respect to $k$; see also p. 709.] ◁


▷ IX.10. Speed of convergence in subcritical compositions. Combining the exponential tail
estimate (17) and local estimates deriving from the singularity analysis of "semi-small" powers
in the previous note, one obtains for the distribution functions associated with $p_{n,k}$ and $p_k$ the
speed estimate

$$\sup_k |F_n(k) - F(k)| \le \frac{L}{n^{\theta_2}}.$$

There, $L$ and $\theta_2$ are two positive constants. ◁


Subcritical constructions. The functional composition schema encompasses the 
sequence, set, and cycle constructions of the labelled universe. We state the following 
proposition. 

Proposition IX.2 (Subcritical constructions, number of components). Consider the
labelled constructions of sequence, set, and cycle. Assume the subcriticality conditions
of the previous proposition, namely $\tau < 1$, $\tau < \infty$, $\tau < 1$, respectively, where
$\tau$ is the singular value of $h(z)$. Then, the distribution of the number $\chi$ of components
determined by $f_{n,k}/f_n$ is such that $\chi - 1$ admits a discrete limit law that is of
type, respectively, negative binomial NB[2], Poisson, and geometric: the limit forms
$q_k = \lim_{n \to \infty} \mathbb{P}_n(\chi = k)$ satisfy, respectively, for $k \ge 0$,

$$q^{S}_{k+1} = (1 - \tau)^2 (k + 1)\, \tau^{k}, \qquad q^{P}_{k+1} = e^{-\tau}\, \frac{\tau^{k}}{k!}, \qquad q^{C}_{k+1} = (1 - \tau)\, \tau^{k}.$$


Proof. It suffices to take for the outer function $g$ in the composition $g \circ h$ the quantities

$$Q(w) = \frac{1}{1 - w}, \qquad E(w) = e^{w}, \qquad L(w) = \log \frac{1}{1 - w}. \tag{18}$$

According to Proposition IX.1 and Equation (18) above, the PGF of the discrete limit
law involves the derivatives

$$Q'(w) = \frac{1}{(1 - w)^2}, \qquad E'(w) = e^{w}, \qquad L'(w) = \frac{1}{1 - w}.$$

By definition of the classical discrete laws in Figure IX.5, p. 621, it is seen that the
last two cases precisely give rise to the classical Poisson and geometric law. The first
case gives rise to the negative binomial law NB[2], or equivalently the sum of two
independent geometrically distributed random variables. ∎
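
The specialization carried out in this proof can be checked numerically. The following Python sketch evaluates the general formula $q_k = k g_k \tau^{k-1}/g'(\tau)$ of Proposition IX.1 for the three outer functions of (18) and compares the result with the closed forms of Proposition IX.2. The value `tau = 0.3`, the truncation `kmax`, and the helper name `limit_law` are illustrative assumptions, not part of the text.

```python
import math

def limit_law(g_coeff, g_prime_at_tau, tau, kmax):
    """Return [q_1, ..., q_kmax] with q_k = k * g_k * tau**(k-1) / g'(tau)."""
    return [k * g_coeff(k) * tau ** (k - 1) / g_prime_at_tau
            for k in range(1, kmax + 1)]

tau = 0.3   # hypothetical singular value of h; any 0 < tau < 1 is subcritical here
kmax = 60   # truncation level; the neglected tail is far below rounding error

# Outer functions of (18): g_k = [w^k] g(w) and g'(tau) for sequence, set, cycle.
seq = limit_law(lambda k: 1.0, 1.0 / (1.0 - tau) ** 2, tau, kmax)
set_ = limit_law(lambda k: 1.0 / math.factorial(k), math.exp(tau), tau, kmax)
cyc = limit_law(lambda k: 1.0 / k, 1.0 / (1.0 - tau), tau, kmax)

# Closed forms of Proposition IX.2; list index k corresponds to q_{k+1}.
nb2 = [(1 - tau) ** 2 * (k + 1) * tau ** k for k in range(kmax)]
poisson = [math.exp(-tau) * tau ** k / math.factorial(k) for k in range(kmax)]
geometric = [(1 - tau) * tau ** k for k in range(kmax)]

for name, law in (("sequence", seq), ("set", set_), ("cycle", cyc)):
    print(name, "total mass:", sum(law))
```

Each list sums to 1 up to truncation and rounding, as expected of a probability distribution, and the shift by one reflects the fact that a non-empty structure has at least one component.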


The technical simplicity with which limit laws are extracted is worthy of note.
Naturally, the statement also covers unlabelled sequences, since translation into GFs
is the same in both universes. (Other unlabelled constructions usually lead to discrete
laws, as long as they are subcritical; see Note IX.14 for a particular instance.) Also,
subcriticality of a composition $g \circ h$ necessarily entails that $\tau_h$ is finite (since one has
$\tau_h < \rho_g \le +\infty$, by definition). Primary cases of applications of Proposition IX.2
are thus in the realm of "treelike" structures, for which the GFs remain finite at their
radius of convergence, as we have learnt in Chapter VII.

The example that follows illustrates the application of Proposition IX.1 to the
analysis of root degrees in classical varieties of trees. It is especially interesting to
observe the way limit laws directly reflect the combinatorial specifications. For instance,
the root degree in a large random plane tree (a Catalan tree) is found to obey, in the
asymptotic limit, a negative binomial (NB[2]) distribution, which, in a precise sense,
echoes the sequence construction that expresses planarity. For labelled non-plane trees
(Cayley trees), a Poisson law echoes the set construction attached to non-planarity.


Example IX.6. Root degrees in trees. Consider first the number of components in a sequence
(ordered forest) of general Catalan trees. The bivariate OGF is

$$F(z, u) = \frac{1}{1 - u h(z)}, \qquad h(z) = \frac{1}{2}\left(1 - \sqrt{1 - 4z}\right).$$

We have $\tau_h = 1/2 < \rho_g = 1$, so that the composition schema is subcritical. Thus, for a forest
of total size $n$, the number $X_n$ of tree components satisfies

$$\lim_{n \to \infty} \mathbb{P}\{X_n = k\} = \frac{k}{2^{k+1}} \qquad (k \ge 1).$$

Since a tree is equivalent to a node appended to a forest, this asymptotic estimate also holds for
the root degree of a general Catalan tree.

Consider next the number of components in a set (unordered forest) of Cayley trees. The
bivariate EGF is

$$F(z, u) = e^{u h(z)}, \qquad h(z) = z e^{h(z)}.$$

We have $\tau_h = 1 < \rho_g = +\infty$, again a subcritical composition schema. Thus the number $X_n$
of tree components in a random unordered forest of size $n$ admits the limit distribution

$$\lim_{n \to \infty} \mathbb{P}\{X_n = k\} = \frac{e^{-1}}{(k - 1)!} \qquad (k \ge 1),$$

a shifted Poisson law of parameter 1; asymptotically, the same property also holds for the root
degree of a random Cayley tree.

The same method applies more generally to a simple variety of trees $\mathcal{V}$ (see Section VII.3,
p. 452) with generator $\phi$, under the condition of the existence of a root $\tau$ of the characteristic
equation $\phi(\tau) - \tau\phi'(\tau) = 0$ at a point interior to the disc of convergence of $\phi$. The BGF
satisfies

$$V(z, u) = z\phi(uV(z)), \qquad V(z) = \tau - \gamma\sqrt{1 - z/\rho} + O(1 - z/\rho),$$

so that

$$V(z, u) \sim \rho\,\phi(u\tau) - \rho\gamma\, u\, \phi'(u\tau)\sqrt{1 - z/\rho}.$$

The PGF of the distribution of root degree is accordingly

$$\frac{u\phi'(\tau u)}{\phi'(\tau)} = \sum_{k \ge 1} \frac{k \phi_k \tau^{k}}{\phi(\tau)}\, u^{k}.$$

This limit law was established under its local form in Chapter VII, p. 456, by means of univariate
asymptotics; the present example shows the synthetic character of a derivation based on the
continuity theorem for PGFs. ∎
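
The Catalan case of this example lends itself to an exact finite-size check. The sketch below assumes the classical Lagrange-inversion count $[z^n]\, h(z)^k = \frac{k}{n}\binom{2n-k-1}{n-k}$ for ordered forests of $k$ general Catalan trees with $n$ nodes in total (obtained from $h = z/(1-h)$); the function names are illustrative. It compares the exact distribution of the number of components with the limit $k/2^{k+1}$.

```python
from fractions import Fraction
from math import comb

def forest_count(n, k):
    """Ordered forests of k general Catalan trees with n nodes in total,
    i.e. [z^n] h(z)^k = (k/n) * C(2n-k-1, n-k) (Lagrange inversion)."""
    if k < 1 or k > n:
        return 0
    return k * comb(2 * n - k - 1, n - k) // n

def component_law(n):
    """Exact law of the number of trees in a uniform random forest of size n.
    Entry k-1 of the returned list is P{X_n = k}."""
    counts = [forest_count(n, k) for k in range(1, n + 1)]
    total = sum(counts)
    return [Fraction(c, total) for c in counts]

def limit_prob(k):
    """Limit law of Example IX.6 for Catalan forests: k / 2^(k+1)."""
    return Fraction(k, 2 ** (k + 1))

law = component_law(200)
for k in range(1, 6):
    print(k, float(law[k - 1]), float(limit_prob(k)))
```

For $n = 200$ the exact probabilities already agree with the limit values to about two or three decimal places, which is consistent with an error term of order $1/n$.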


A further direct application of the continuity of PGFs is the distribution of the
number of $\mathcal{H}$-components of a fixed size $m$ in a composition $\mathcal{G} \circ \mathcal{H}$ with GF $g(h(z))$,
again under the subcriticality condition. In the terminology of Chapter II, we are thus
characterizing the profile of combinatorial objects, as regards components of some
fixed size. The bivariate GF is then

$$\mathcal{F} = \mathcal{G} \circ (\mathcal{H} \setminus \mathcal{H}_m + u\mathcal{H}_m) \quad\Longrightarrow\quad F(z, u) = g\!\left(h(z) + (u - 1) h_m z^{m}\right),$$

with $h_m = [z^m] h(z)$. The singular expansion at $z = \rho$ is

$$F(z, u) = g\!\left(\tau + (u - 1) h_m \rho^{m}\right) - c\, g'\!\left(\tau + (u - 1) h_m \rho^{m}\right)(1 - z/\rho)^{\lambda} + o\!\left((1 - z/\rho)^{\lambda}\right).$$

Thus, the PGF $p_n(u)$ for objects of size $n$ satisfies

$$\lim_{n \to \infty} p_n(u) = \frac{g'\!\left(\tau + (u - 1) h_m \rho^{m}\right)}{g'(\tau)}. \tag{19}$$

As before this calculation specializes to the case of sequences, sets, and cycles giving
a result analogous to Proposition IX.1.


Proposition IX.3 (Subcritical constructions, number of fixed-size components). Under
the subcriticality conditions of Proposition IX.2, the number of components of a
fixed size $m$ in a labelled sequence, set, or cycle construction applied to a class with
GF $h(z)$ admits a discrete limit law. Let $h_m := [z^m] h(z)$ and let $\rho$ be the radius of
convergence of $h(z)$, with $\tau := h(\rho)$. For sequences, sets, and cycles, the limit laws
are, respectively, negative binomial NB[2]($a$), Poisson($\lambda$), and geometric($b$), with
parameters

$$a = \frac{h_m \rho^{m}}{1 - \tau + h_m \rho^{m}}, \qquad \lambda = h_m \rho^{m}, \qquad b = \frac{h_m \rho^{m}}{1 - \tau + h_m \rho^{m}}.$$

Proof. Instantiate (19) with $g$, one of the three functions of (18). ∎


Example IX.7. Root subtrees of size $m$. In a Cayley tree, the number of root subtrees of some
fixed size $m$ has, in the limit, a Poisson distribution,

$$p_k = e^{-\lambda}\, \frac{\lambda^{k}}{k!}, \qquad \lambda = \frac{m^{m-1} e^{-m}}{m!}.$$

In a general Catalan tree, the distribution is a negative binomial NB[2],

$$p_k = (1 - a)^2 (k + 1)\, a^{k}, \qquad a^{-1} = 1 + \frac{m\, 2^{2m-1}}{\binom{2m-2}{m-1}}.$$

Generally, for a simple variety of trees under the usual conditions of existence of a solution to
the characteristic equation, $V = z\phi(V)$, one finds "en deux coups de cuillère à pot",

$$\begin{aligned} V(z, u) &= z\phi\!\left(V(z) + V_m z^{m}(u - 1)\right), \\ V(z, u) &\sim \rho\,\phi\!\left(\tau + V_m \rho^{m}(u - 1)\right) - \rho\gamma\, \phi'\!\left(\tau + V_m \rho^{m}(u - 1)\right)\sqrt{1 - z/\rho}, \\ \text{limit PGF} &= \frac{\phi'\!\left(\tau + V_m \rho^{m}(u - 1)\right)}{\phi'(\tau)}. \end{aligned}$$

(Notations are the same as in Example IX.6.) ∎
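
As a quick consistency check on the parameters of this example, the Python sketch below evaluates them from the general recipe of Proposition IX.3 (with $x = h_m \rho^m$) and compares the Catalan case with the closed form for $a^{-1}$. It assumes the standard counts $h_m = m^{m-1}/m!$ with $\rho = e^{-1}$ for Cayley trees, and $h_m = \frac{1}{m}\binom{2m-2}{m-1}$ with $\rho = 1/4$, $\tau = 1/2$ for general Catalan trees; the function names are illustrative.

```python
from math import comb, exp, factorial

def cayley_poisson_rate(m):
    """Poisson rate lambda = h_m * rho^m, with h_m = m^(m-1)/m! and rho = 1/e."""
    return m ** (m - 1) * exp(-m) / factorial(m)

def catalan_nb2_parameter(m):
    """NB[2] parameter a = x / (1 - tau + x), x = h_m * rho^m,
    with h_m = C(2m-2, m-1)/m, rho = 1/4 and tau = 1/2."""
    x = comb(2 * m - 2, m - 1) / m / 4 ** m
    return x / (0.5 + x)

def catalan_nb2_parameter_closed(m):
    """Same parameter from the closed form 1/a = 1 + m * 2^(2m-1) / C(2m-2, m-1)."""
    return 1.0 / (1.0 + m * 2 ** (2 * m - 1) / comb(2 * m - 2, m - 1))

for m in range(1, 6):
    print(m, cayley_poisson_rate(m), catalan_nb2_parameter(m),
          catalan_nb2_parameter_closed(m))
```

For $m = 1$ (root subtrees reduced to a leaf) this gives $\lambda = e^{-1}$ for Cayley trees and $a = 1/3$ for Catalan trees.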


We shall see later that similar discrete distributions (the Poisson and negative 
binomial law of Proposition IX.3) also arise in critical set constructions of the exp-log
type (Example IX.23, p. 675), while supercritical sequences lead to Gaussian
limits (Proposition IX.7, p. 652). Furthermore, given the generality of the methods 
and the analytic diversity of functional compositions, it should be clear that schemas 
leading to discrete limit laws can be listed ad libitum—in essence, conditions are 
that the auxiliary variable u does not affect the location nor the nature of the dominant 
singularity of F(z, u). The notes below provide a small sample of the many extensions 
of the method that are possible.

▷ IX.11. The product schema. Define

$$F(z, u) = A(uz) \cdot B(z),$$

that corresponds to a product construction, $\mathcal{F} = \mathcal{A} \times \mathcal{B}$, with $u$ marking the size of the
$\mathcal{A}$-component in the product. Assume that the radii of convergence satisfy $\rho_A > \rho_B$ and that $B(z)$
has a unique dominant singularity of the algebraic-logarithmic type. Then, the size of the $\mathcal{A}$
component in a random $\mathcal{F}$-structure has a discrete limit law with PGF (where $\rho = \rho_B$),

$$p(u) = \frac{A(\rho u)}{A(\rho)}.$$

The proof follows by singularity analysis. (Alternatively, an elementary derivation can be given
under the weaker requirement that the $b_n = [z^n] B(z)$ satisfy $b_{n+1}/b_n \to \rho^{-1}$.) ◁


▷ IX.12. Bell number distributions. Consider the "set-of-sets" schema

$$\mathcal{F} = \mathrm{SET}(\mathrm{SET}_{\ge 1}(\mathcal{H})) \quad\Longrightarrow\quad F(z, u) = \exp\!\left(e^{u h(z)} - 1\right),$$

assuming subcriticality. Then the number $\chi$ of components satisfies asymptotically a "derivative
Bell" law:

$$\lim_{n \to \infty} \mathbb{P}_n(\chi = k) = \frac{1}{K}\, \frac{k S_k \tau^{k-1}}{k!}, \qquad K = e^{e^{\tau} + \tau - 1},$$

where $S_k = k!\,[z^k]\, e^{e^{z} - 1}$ is a Bell number. There exist parallel results: for sequence-of-sets,
involving the surjection numbers; for set-of-sequences involving the fragmented permutation
numbers. ◁
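
To see that the "derivative Bell" law of Note IX.12 is a genuine probability distribution, one can sum it numerically. The sketch below computes Bell numbers by the Bell triangle and evaluates $q_k = k S_k \tau^{k-1}/(k!\,K)$ with $K = e^{e^{\tau} + \tau - 1}$; the value `tau = 0.5` and the truncation at `kmax = 80` are illustrative assumptions.

```python
import math

def bell_numbers(kmax):
    """Bell numbers S_0, ..., S_kmax computed with the Bell triangle."""
    row = [1]
    bells = [1]
    for _ in range(kmax):
        new_row = [row[-1]]            # each row starts with the last entry of the previous row
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

def derivative_bell_law(tau, kmax):
    """q_k = k * S_k * tau^(k-1) / (k! * K) for k = 1..kmax, K = exp(e^tau + tau - 1)."""
    bells = bell_numbers(kmax)
    big_k = math.exp(math.exp(tau) + tau - 1.0)
    return [k * bells[k] / math.factorial(k) * tau ** (k - 1) / big_k
            for k in range(1, kmax + 1)]

law = derivative_bell_law(0.5, 80)
print(sum(law))
```

The total mass equals 1 up to rounding because $\sum_{k} k S_k \tau^{k-1}/k!$ is the derivative of $\exp(e^{w} - 1)$ at $w = \tau$, which is exactly $K$.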


▷ IX.13. High levels in Cayley trees. The number of nodes at level 5 (i.e., at distance 5 from
the root) in a Cayley tree has the nice PGF

$$u\, \frac{d}{du}\, e^{-1 + e^{-1 + e^{-1 + e^{-1 + e^{-1 + u}}}}},$$

so that the distribution involves "super-duper-hyper-Bell numbers". ◁


▷ IX.14. Root degree in non-plane unlabelled trees. Discrete laws may also arise from an
unlabelled set construction, but their form is complicated, reflecting the presence of Pólya operators.
Consider the class of non-plane unlabelled trees (p. 71)

$$\mathcal{H} = \mathcal{Z} \times \mathrm{MSET}(\mathcal{H}) \quad\Longrightarrow\quad H(z) = z \exp\!\left(\sum_{k \ge 1} \frac{1}{k}\, H(z^{k})\right).$$

The OGF $H(z)$ is of singularity analysis class (Section VII.5, p. 475), and $H(z) \sim 1 - \gamma(1 - z/\rho)^{1/2}$.
Then the distribution with PGF

$$q(u) = u\rho \exp\!\left(\sum_{k \ge 1} \frac{u^{k}}{k}\, H(\rho^{k})\right)$$

is the limit law of root degree in non-plane unlabelled trees. ◁


Lattice paths. As a last example here, we discuss the length of the longest initial
run of $a$'s in random binary words satisfying various types of constraints. This discussion
completes the informal presentation of Section IX.1, Examples IX.1 and IX.2. The
basic combinatorial objects are the set $\mathcal{W} = \{a, b\}^{\star}$ of binary words. A word $w \in \mathcal{W}$
can also be viewed as describing a walk in the plane, provided one interprets $a$ and
$b$ as the vectors $(+1, +1)$ and $(+1, -1)$, respectively. Such walks in turn describe
fluctuations in coin-tossing games, as described by Feller [205].


Figure IX.7. Walks, excursions, bridges, and meanders of Dyck type: from left to 
right and top to bottom, random samples of length 50. 


The combinatorial decompositions of Section V.4, p. 318, form the basis of our 
combinatorial treatment. What is especially interesting here is to observe the complete 
chain where a specific constraint leads in succession to a combinatorial decomposi- 
tion, a specific analytic type of BGF, and a local singular structure that is eventually 
reflected by a particular limit law. 


Example IX.8. Initial runs in random walks. We consider here walks in the right half-plane
that start from the origin and are made of steps a = (1,1), b = (1,-1). According to the 
discussion of Chapter VII (p. 506), one can distinguish four major types of walks (Figure IX.7). 
— Unconstrained walks (W) corresponding to words and freely described by W = 
SEQ(a, b); 
— Dyck paths (D), which always have a non-negative ordinate and end at level 0; the 
closely related class G = Db represents the collection of gambler’s ruin sequences. 
In probability theory, Dyck paths are also referred to as excursions. 
— Bridges (B), which are walks that may have negative ordinates but must finish at 
level 0. 
— Meanders (M), which always have a non-negative altitude and may end at an arbi- 
trary non-negative altitude. 
The parameter $\chi$ of interest is in all cases the length of the (longest) initial run of $a$'s.
First, unconstrained walks obey the decomposition

$$\mathcal{W} = \mathrm{SEQ}(a)\, \mathrm{SEQ}(b\, \mathrm{SEQ}(a)),$$

already repeatedly employed. Thus, the BGF is

$$W(z, u) = \frac{1}{1 - zu} \cdot \frac{1}{1 - z(1 - z)^{-1}}.$$

By singularity analysis of the pole at $\rho = 1/2$, the PGF of $\chi$ on random words of $\mathcal{W}_n$ satisfies

$$p_n(u) \sim \frac{1/2}{1 - u/2},$$

for all $u$ such that $|u| < 2$. This asymptotic value of the PGF corresponds to a limit law, which is
a geometric with parameter 1/2, in agreement with what was found in Examples IX.1 and IX.2.

Next, consider Dyck paths. Such a path decomposes into "arches" that are themselves
Dyck paths encapsulated by a pair $a$, $b$, namely,

$$\mathcal{D} = \mathrm{SEQ}(a \mathcal{D} b),$$

which yields a GF of the Catalan domain,

$$D(z) = \frac{1 - \sqrt{1 - 4z^2}}{2z^2}.$$

In order to extract the initial run of $a$'s, we observe that a word whose initial $a$-run is $a^k$
contains $k$ components of the form $b\mathcal{D}$. This corresponds to a decomposition in terms of the first
traversals of altitudes $k - 1, \ldots, 1, 0$,

$$\mathcal{D} = \sum_{k \ge 0} a^{k} (b \mathcal{D})^{k}$$

(a special "first passage decomposition" in the sense of p. 321), illustrated by the following
diagram:

[Diagram: first-passage decomposition of a Dyck path; not recoverable from the source.]

Thus, the BGF is

$$D(z, u) = \frac{1}{1 - z^2 u D(z)},$$

which is an even function of $z$. In terms of the singular element, $\delta = (1 - 4z)^{1/2}$, one finds

$$D(z^{1/2}, u) = \frac{2}{2 - u} - \frac{2u}{(2 - u)^2}\, \delta + O(\delta^2),$$

as $z \to 1/4$. Thus, the PGF of $\chi$ on random words of $\mathcal{D}_{2n}$ satisfies

$$p_{2n}(u) \sim \frac{u}{(2 - u)^2},$$

which is the PGF of a negative binomial NB[2] of parameter 1/2 shifted by 1. (Naturally, in
this case, explicit expressions for the combinatorial distribution are available, as this counting
is equivalent to the classical ballot problem.)

A bridge decomposes into a sequence of arches, either positive or negative,

$$\mathcal{B} = \mathrm{SEQ}(a \mathcal{D} b + b \overline{\mathcal{D}} a),$$

where $\overline{\mathcal{D}}$ is like $\mathcal{D}$, but with the roles of $a$ and $b$ interchanged. In terms of OGFs, this gives

$$B(z) = \frac{1}{1 - 2z^2 D(z)} = \frac{1}{\sqrt{1 - 4z^2}}.$$

The set $\mathcal{B}^{+}$ of non-empty walks that start with at least one $a$ admits a decomposition similar to
that of $\mathcal{D}$,

$$\mathcal{B}^{+} = \left(\sum_{k \ge 1} a^{k} b (\mathcal{D} b)^{k-1}\right) \cdot \mathcal{B},$$

since the paths factor uniquely as a $\mathcal{D}$ component that hits 0 for the first time followed by a $\mathcal{B}$
oscillation. Thus,

$$B^{+}(z) = \frac{z^2}{1 - z^2 D(z)}\, B(z).$$

The remaining cases $\mathcal{B}^{-} = \mathcal{B} \setminus \mathcal{B}^{+}$ consist of either the empty word or of a sequence of positive
or negative arches starting with a negative arch, so that

$$B^{-}(z) = 1 + \frac{z^2 D(z)}{1 - 2z^2 D(z)}.$$

The BGF results from these decompositions:

$$B(z, u) = \frac{z^2 u}{1 - z^2 u D(z)}\, B(z) + 1 + \frac{z^2 D(z)}{1 - 2z^2 D(z)}.$$

Again, the singular expansion is obtained mechanically,

$$B(z^{1/2}, u) = \frac{1}{2 - u}\, \frac{1}{\delta} + O(1), \qquad\text{where } \delta = (1 - 4z)^{1/2}.$$

Thus, the PGF of $\chi$ on random words of $\mathcal{B}_{2n}$ satisfies

$$p_{2n}(u) \sim \frac{1}{2 - u}.$$

The limit law is now geometric of parameter 1/2.
A meander decomposes into an initial run $a^k$, a succession of descents with their companion
(positive) arches in some number $\ell \le k$, and a succession of ascents with their corresponding
(positive) arches. The computations are similar to the previous cases, more intricate but still
"automatic". One finds that

$$M(z, u) = \frac{1}{1 - X}\left(1 + \frac{XY}{(1 - XY)(1 - Y)}\right),$$

with $X = zu$, $Y = zD(z)$, so that

$$M(z, u) = 2\, \frac{1 - u - 2z + 2uz^2 + (u - 1)\sqrt{1 - 4z^2}}{(1 - zu)\left(1 - 2z - \sqrt{1 - 4z^2}\right)\left(2 - u + u\sqrt{1 - 4z^2}\right)}.$$

There are now two singularities at $z = \pm 1/2$, with singular expansions,

$$M(z, u) \underset{z \to 1/2}{=} \frac{u\sqrt{2}}{(2 - u)^2 \sqrt{1 - 2z}} + O(1), \qquad M(z, u) \underset{z \to -1/2}{=} O(1),$$

so that only the singularity at 1/2 matters asymptotically. Then, we have

$$p_n(u) \sim \frac{u}{(2 - u)^2},$$

and the limit law is a shifted negative binomial NB[2] of parameter 1/2. In summary:

Proposition IX.4. The length of the initial run of $a$'s in unconstrained walks and bridges is
asymptotically distributed as a geometric; in Dyck excursions and meanders it is distributed as
a negative binomial NB[2].


Similar analyses can be applied to walks with a finite set of step types [27]. ∎
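
The four classes of Example IX.8 are small enough to be enumerated exhaustively for short lengths, which gives a concrete check of the decompositions. The Python sketch below is a brute-force illustration (not the generating-function method of the text): it lists all $\pm 1$ words of length 12, classifies each as walk, bridge, meander, or Dyck path, and tabulates the length of the initial run of $a$'s. The encoding of $a$ as `+1` and $b$ as `-1`, the length 12, and all function names are illustrative choices.

```python
from collections import Counter
from itertools import product

def initial_run(word):
    """Length of the initial run of a's, where a is encoded as +1."""
    run = 0
    for step in word:
        if step != 1:
            break
        run += 1
    return run

def classify(word):
    """Return the classes among W (walk), B (bridge), M (meander), D (Dyck)
    to which a word of +1/-1 steps belongs."""
    height, lowest = 0, 0
    for step in word:
        height += step
        lowest = min(lowest, height)
    classes = {"W"}
    if height == 0:
        classes.add("B")
    if lowest >= 0:
        classes.add("M")
        if height == 0:
            classes.add("D")
    return classes

def run_distributions(n):
    """Counts of words of length n by class and by initial run length."""
    counts = {name: Counter() for name in "WBMD"}
    for word in product((1, -1), repeat=n):
        k = initial_run(word)
        for name in classify(word):
            counts[name][k] += 1
    return counts

counts = run_distributions(12)
for name in "WBMD":
    total = sum(counts[name].values())
    print(name, total, [round(counts[name][k] / total, 3) for k in range(5)])
```

At this small length the empirical frequencies are only a rough match for the limit laws of Proposition IX.4; the exhaustive counts themselves (for instance the totals per class) are exact.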


▷ IX.15. Left-most branch of a unary-binary (Motzkin) tree. The class of unary-binary trees
(or Motzkin trees) is defined as the class of unlabelled rooted plane trees where (out)degrees
of nodes are restricted to the set $\{0, 1, 2\}$. The parameter equal to the length of the left-most
branch has a limit law that is a negative binomial NB[2]. Find its parameter. ◁

IX. 4. Continuous limit laws


Throughout this chapter, our goal is to quantify sequences of random variables
$X_n$ that arise from an integer-valued combinatorial parameter $\chi$ defined on a
combinatorial class $\mathcal{F}$. It is a fact that, when the mean $\mu_n$ and the standard deviation $\sigma_n$
of $X_n$ both tend to infinity as $n$ gets large, then a limit law that is continuous usually
holds. That limit law arises not directly from the $X_n$ themselves (as was the case for
discrete-to-discrete convergence in the previous section) but rather from their
standardized versions:

$$X_n^{\star} = \frac{X_n - \mu_n}{\sigma_n}.$$

In this section, we provide definitions and major theorems needed to deal with such
a discrete-to-continuous situation⁷. Our developments largely parallel those of
Section IX.2 relative to the discrete case, with integral transforms serving as the
continuous analogue of probability generating functions.


IX.4.1. Convergence to a continuous limit. A real random variable $Y$ is in all
generality specified by its distribution function,

$$\mathbb{P}\{Y \le x\} = F(x).$$

It is said to be continuous if $F(x)$ is continuous (see Appendix C.2: Random variables,
p. 771). In that case, $F(x)$ has no jump, and there is no single value in the range of
$Y$ that bears a non-zero probability mass. If in addition $F(x)$ is differentiable, the
random variable $Y$ is said to have a density, $g(x) = F'(x)$, so that

$$\mathbb{P}(Y \le x) = \int_{-\infty}^{x} g(w)\, dw, \qquad \mathbb{P}\{x < Y \le x + dx\} = g(x)\, dx.$$

A particularly important case for us here is the standard Gaussian or normal $\mathcal{N}(0, 1)$
distribution function,

$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-w^2/2}\, dw,$$

also called the error function (erf); with the usual normalization of erf, one has
$\Phi(x) = \frac{1}{2}\left(1 + \operatorname{erf}(x/\sqrt{2})\right)$. The corresponding density is

$$g(x) \equiv \Phi'(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}.$$

This section and the next ones are relative to the existence of limit laws of the
continuous type, with Gaussian limits playing a prominent role. The general definitions of
convergence in law (or in distribution) and of weak convergence (see Appendix C.5:
Convergence in law, p. 776) instantiate as follows.


Definition IX.3 (Discrete-to-continuous convergence). Let $Y$ be a continuous random
variable with distribution function $F_Y(x)$. A sequence of random variables $Y_n$ with


⁷ Probability theory has elaborated a unified way of dealing with discrete and continuous laws alike,
as well as with mixed cases; see Appendix C.1: Probability spaces and measure, p. 769. For analytic
combinatorics, it seems, however, preferable to develop the two branches of the theory in a parallel fashion.

distribution functions $F_{Y_n}(x)$ is said to converge in distribution to $Y$ if, pointwise, for
each $x$,

$$\lim_{n \to \infty} F_{Y_n}(x) = F_Y(x).$$

In that case, one writes $Y_n \Rightarrow Y$ and $F_{Y_n} \Rightarrow F_Y$. Convergence is said to take place with
speed $\epsilon_n$ if

$$\sup_{x \in \mathbb{R}} \left|F_{Y_n}(x) - F_Y(x)\right| \le \epsilon_n.$$


The definition does not a priori impose uniform convergence. It is a known fact,
however, that convergence of distribution functions to a continuous limit is always
uniform. This uniformity property means that there always exists a speed $\epsilon_n$ that tends
to 0 as $n \to \infty$.


IX. 4.2. Continuity theorems for transforms. Discrete limit laws can be
established via convergence of PGFs to a common limit, as asserted by the continuity
theorem for PGFs, Theorem IX.1, p. 624. In the case of continuous limit laws, one
has to resort to integral transforms (see Appendix C.3: Transforms of distributions,
p. 772), whose definitions we now recall.


— The Laplace transform, also called the moment generating function, $\lambda_Y(s)$
is defined by

$$\lambda_Y(s) = \mathbb{E}(e^{sY}) = \int_{-\infty}^{+\infty} e^{sx}\, dF(x).$$

— The Fourier transform, also called the characteristic function, $\phi_Y(t)$ is
defined by

$$\phi_Y(t) = \mathbb{E}(e^{itY}) = \int_{-\infty}^{+\infty} e^{itx}\, dF(x).$$

(Integrals are taken in the sense of Lebesgue-Stieltjes or Riemann-Stieltjes; cf. Appendix C.1:
Probability spaces and measure, p. 769.)

There are two classical versions of the continuity theorem, one for characteris- 
tic functions, the other for Laplace transforms. Both may be viewed as extensions 
of the continuity theorem for PGFs. Characteristic functions always exist and the 
corresponding continuity theorem gives a necessary and sufficient condition for con- 
vergence of distributions. As they are a universal tool, characteristic functions are 
therefore often favoured in the probabilistic literature. In the context of this book, 
strong analyticity properties go along with combinatorial constructions so that both 
transforms usually exist and both can be put to good use (Figure IX.8). 


Theorem IX.4 (Continuity of integral transforms). Let $Y$, $Y_n$ be random variables
with Fourier transforms (characteristic functions) $\phi(t)$, $\phi_n(t)$, and assume that $Y$ has
a continuous distribution function. A necessary and sufficient condition⁸ for the
convergence in distribution, $Y_n \Rightarrow Y$, is that, pointwise, for each real $t$,

$$\lim_{n \to \infty} \phi_n(t) = \phi(t).$$


⁸ The first part of this theorem is also known as Lévy's continuity theorem for characteristic functions.

Figure IX.8. The standardized distribution functions of the binomial law (top), the
corresponding Fourier transforms (right), and the Laplace transforms (bottom), for
$n = 3, 6, 9, 12, 15$. The distribution functions centred around the mean $\mu_n = n/2$
and scaled according to the standard deviation $\sigma_n = \sqrt{n/4}$ converge to a limit which
is the Gaussian error function, $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-w^2/2}\, dw$. Accordingly, the
corresponding Fourier transforms (or characteristic functions) converge to $\phi(t) = e^{-t^2/2}$,
while the Laplace transforms (or moment generating functions) converge to
$\lambda(s) = e^{s^2/2}$.


Let $Y$, $Y_n$ be random variables with Laplace transforms $\lambda(s)$, $\lambda_n(s)$ that exist in a
common interval $[-s_0, s_0]$, with $s_0 > 0$. If, pointwise for each real $s \in [-s_0, s_0]$,

$$\lim_{n \to \infty} \lambda_n(s) = \lambda(s),$$

then the $Y_n$ converge in distribution to $Y$: $Y_n \Rightarrow Y$.


Proof. See Billingsley's book [68, Sec. 26] for Fourier transforms and [68, p. 408]
for Laplace transforms. ∎




▷ IX.16. Laplace transforms need not exist. Let $Y_n$ be a mixture of a Gaussian and a Cauchy
distribution:

$$\mathbb{P}(Y_n \le x) = \left(1 - \frac{1}{n}\right) \int_{-\infty}^{x} \frac{e^{-w^2/2}}{\sqrt{2\pi}}\, dw + \frac{1}{\pi n} \int_{-\infty}^{x} \frac{dw}{1 + w^2}.$$

Then $Y_n$ converges in distribution to a standard Gaussian limit $Y$, although $\lambda_n(s)$ only exists for
$\Re(s) = 0$. ◁

In the discrete case, the continuity theorem for PGFs (Theorem IX.1, p. 624)
eventually relies on continuity of the Cauchy coefficient formula that realizes the inversion
needed in recovering coefficients from PGFs. In an analogous manner, the continuity
theorem for integral transforms may be viewed as expressing the continuity of Laplace
or Fourier inversion in the specific context of probability distribution functions.

The next theorem, called the Berry–Esseen inequality, is an effective version of
the Fourier inversion theorem that proves especially useful for characterizing speeds of
convergence. It bounds in a constructive manner the sup-norm distance between two
distribution functions in terms of a special metric distance between their characteristic
functions. Recall that $\|f\|_\infty := \sup_{x \in \mathbb{R}} |f(x)|$.


Theorem IX.5 (Berry–Esseen inequality). Let $F$, $G$ be distribution functions with
characteristic functions $\phi(t)$, $\gamma(t)$. Assume that $G$ has a bounded derivative. There
exist absolute constants $c_1$, $c_2$ such that for any $T > 0$,

$$\|F - G\|_\infty \le c_1 \int_{-T}^{T} \left|\frac{\phi(t) - \gamma(t)}{t}\right| dt + c_2\, \frac{\|G'\|_\infty}{T}.$$

Proof. See Feller [206, p. 538] who gives

$$c_1 = \frac{1}{\pi}, \qquad c_2 = \frac{24}{\pi}$$

as possible values for the constants. ∎


This theorem is typically used with $G$ being the limit distribution function (often
a Gaussian for which $\|G'\|_\infty = (2\pi)^{-1/2}$) and $F = F_n$, a distribution that belongs to
a sequence converging to $G$. The quantity $T$ may be assigned an arbitrary value; the
one giving the best bound in a specific application context is then naturally chosen.


▷ IX.17. A general version of Berry–Esseen. Let $F$, $G$ be two distribution functions. Define
Lévy's "concentration function", $Q_G(h) := \sup_x (G(x + h) - G(x))$, for $h > 0$. There exists
an absolute constant $C$ such that

$$\|F - G\|_\infty \le C\, Q_G\!\left(\frac{1}{T}\right) + C \int_{-T}^{+T} \left|\frac{\phi(t) - \gamma(t)}{t}\right| dt.$$

See Elliott's book [191, Lemma 1.47] and the article by Stef and Tenenbaum for a
discussion [557]. The latter provides inequalities analogous to Berry–Esseen, but relative to Laplace
transforms on the real line (distance bounds tend to be much weaker due to the smoothing nature
of the Laplace transform). ◁


Large powers and the central limit theorem. Here is the simplest conceivable
illustration of how to use the continuity theorem, Theorem IX.4. The unbiased
binomial distribution $\mathrm{Bin}(n, 1/2)$ is defined as the distribution of a random variable $X_n$
with PGF

$$p_n(u) = \mathbb{E}(u^{X_n}) = \left(\frac{1}{2} + \frac{u}{2}\right)^{n},$$

and characteristic function,

$$\phi_n(t) \equiv \mathbb{E}(e^{itX_n}) = p_n(e^{it}) = \frac{1}{2^n}\left(1 + e^{it}\right)^{n}.$$

The mean is $\mu_n = n/2$ and the variance is $\sigma_n^2 = n/4$. Therefore, the standardized
variable $X_n^{\star} = (X_n - \mu_n)/\sigma_n$ has characteristic function

$$\phi_n^{\star}(t) = \mathbb{E}(e^{itX_n^{\star}}) = \left(\cosh \frac{it}{\sqrt{n}}\right)^{n} = \left(\cos \frac{t}{\sqrt{n}}\right)^{n}. \tag{20}$$

The asymptotic form is directly found by taking logarithms, and one gets

$$\log \phi_n^{\star}(t) = n \log\left(1 - \frac{t^2}{2n} + \frac{t^4}{24 n^2} - \cdots\right) = -\frac{t^2}{2} + O\!\left(\frac{1}{n}\right), \tag{21}$$

pointwise, for any fixed $t$, as $n \to \infty$. Thus, we have $\phi_n^{\star}(t) \to e^{-t^2/2}$, as $n \to \infty$.
This establishes convergence to the Gaussian limit. In addition, upon choosing
$T = n^{1/2}$, the Berry–Esseen inequalities (Theorem IX.5) show that the speed of
convergence is $O(n^{-1/2})$.
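
Both statements of this paragraph can be observed numerically. The Python sketch below evaluates the characteristic function $(\cos(t/\sqrt{n}))^n$ of the standardized binomial against $e^{-t^2/2}$, and computes the sup-norm distance between the standardized binomial distribution function and $\Phi$ directly from the binomial probabilities (rather than through the Berry–Esseen integral). The sample sizes and function names are illustrative assumptions; $\Phi$ is expressed through `math.erf`.

```python
import math

def std_normal_cdf(x):
    """Phi(x), written in terms of the standard erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def char_fn_standardized_binomial(n, t):
    """Characteristic function of (X_n - n/2) / sqrt(n/4), as in Equation (20)."""
    return math.cos(t / math.sqrt(n)) ** n

def kolmogorov_distance_binomial(n):
    """sup_x |F_n(x) - Phi(x)| for the standardized Bin(n, 1/2) law.
    The supremum is attained at a jump, so both one-sided values are checked."""
    mu, sigma = n / 2.0, math.sqrt(n) / 2.0
    worst, cdf = 0.0, 0.0
    for k in range(n + 1):
        phi = std_normal_cdf((k - mu) / sigma)
        worst = max(worst, abs(cdf - phi))   # left limit at the jump
        cdf += math.comb(n, k) / 2.0 ** n
        worst = max(worst, abs(cdf - phi))   # value at the jump
    return worst

distances = {n: kolmogorov_distance_binomial(n) for n in (25, 100, 400)}
for n, d in distances.items():
    print(n, d, d * math.sqrt(n))
print(char_fn_standardized_binomial(10000, 1.5), math.exp(-1.5 ** 2 / 2))
```

The product of the distance by $\sqrt{n}$ stays roughly constant across the three sample sizes, in line with a speed of convergence of order $n^{-1/2}$.
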
▷ IX.18. De Moivre's Central Limit Theorem. Characteristic functions extend the normal limit
law to biased binomial distributions with PGF $(p + qu)^n$, where $p + q = 1$. (Of course, the
result is also accessible from elementary asymptotic calculus, which constitutes De Moivre's
original derivation; see Note IX.1, p. 615.) ◁
The Central Limit Theorem, known as the CLT (the term was coined by Pólya
in 1920, originally because of its "zentrale Rolle" [central role] in probability
theory), expresses the asymptotically Gaussian character of sums of random variables. It
was first discovered⁹ in the particular case of binomial variables by De Moivre. The
general version is due to Gauss (who, around 1809, had realized from his works on
geodesy and astronomy the universality of the "Gaussian" law but had only
unsatisfactory arguments) and to Laplace (in the period 1812-1820). Laplace in particular
uses Fourier methods and his formulation of the CLT is highly general, although some
of the precise validity conditions of his arguments only became apparent more than a
century later.


Theorem IX.6 (Basic CLT). Let $T_j$ be independent random variables supported by
$\mathbb{R}$ with a common distribution of (finite) mean $\mu$ and (finite) standard deviation $\sigma$. Let
$S_n := T_1 + \cdots + T_n$. Then the standardized sum $S_n^*$ converges to the standard normal
distribution,
$$S_n^* \equiv \frac{S_n - \mu n}{\sigma\sqrt{n}} \;\Rightarrow\; \mathcal{N}(0, 1).$$
Proof. The proof is based on local expansions of characteristic functions, much like
those in Equations (20) and (21). First, by a general theorem (see the summary in
Figure B.2, p. 777 and [424, p. 22], for a proof), the existence of the first two moments
implies that $\phi_{T_1}$ is twice differentiable at 0, so that
$$\phi_{T_1}(t) = 1 + i\mu t - \frac{1}{2}(\mu^2 + \sigma^2)\,t^2 + o(t^2), \qquad t \to 0.$$
By shifting, it suffices to consider the case of zero-mean variables ($\mu = 0$). We then
have, pointwise for each $t$ as $n \to \infty$,
$$\phi_{S_n^*}(t) = \left(\phi_{T_1}\left(\frac{t}{\sigma\sqrt{n}}\right)\right)^n = \left(1 - \frac{t^2}{2n} + o\left(\frac{t^2}{n}\right)\right)^n \to e^{-t^2/2}, \tag{22}$$
as in Equations (20) and (21). The conclusion follows from the continuity theorem.
(This theorem is in virtually any basic book on probability theory, e.g., [206, p. 259]
or [68, Sec. 27].) □

² For a perspective on historical aspects of the CLT, we refer to Hans Fischer’s well-informed
monograph [213].


It is important to observe what happens if the $T_j$ are discrete and given by their
common PGF $p(u) := p_{T_1}(u)$ (a case otherwise discussed in Subsection VIII. 8.3,
p. 591, under a different angle). The proof above makes use of characteristic functions,
that is, we set $u = e^{it}$, so that $u = 1$ corresponds to $t = 0$. Since there is a scaling
of $t$ by $1/\sqrt{n}$ in the crucial estimate (22), we only need information on $p(u)$ relative
to a small neighbourhood of $u = 1$. What this discussion brings is the following
general fact: in establishing continuous limit laws from discrete distributions, it is the
behaviour near 1 of the discrete probability generating functions that matters. We are
going to make abundant use of this observation in the next section.


▷ IX.19. Poisson distributions of large parameter. Let $X_\lambda$ be Poisson with rate $\lambda$. As $\lambda$ tends to
infinity, Stirling’s formula easily provides convergence to a Gaussian limit. The error terms can
then be compared to what the Berry–Esseen bounds provide. (In terms of speed of convergence,
such Poisson variables of large parameters sometimes yield better approximations to combinatorial
distributions than the standard Gaussian law; see Hwang’s comprehensive study [341] for
a general analytic approach.) ◁


▷ IX.20. Extensions of the CLT. The central limit theorem in the independent case is the subject
of Petrov’s comprehensive monographs [481, 482]. There are many extensions of the CLT,
to variables that are independent but not necessarily identically distributed (the Lindeberg–Lyapunov
conditions) or variables that are only dependent in some weak sense (mixing conditions);
see the discussion by Billingsley [68, Sec. 27]. In the particular case where the $T_j$’s
are discrete, a stronger “local” form of the Theorem results from the saddle-point method; see
our earlier discussion in Section VIII. 8, p. 585, the classic treatment by Gnedenko and Kolmogorov
[294], and extensions in Section IX. 9 below. ◁


IX. 4.3. Tail estimates. Contrary to what happens with characteristic functions
that are always defined, the mere existence of the Laplace transform of a distribution
in a non-empty interval containing 0 implies interesting tail properties. We quote here:


Theorem IX.7 (Exponential tail bounds). Let $Y$ be a random variable such that its
Laplace transform $\lambda(s) = E(e^{sY})$ exists in an interval $[-a, b]$, where $-a < 0 < b$.
Then the distribution of $Y$ admits exponential tails, in the sense that, as $x \to +\infty$,
there holds
$$P(Y < -x) = O(e^{-ax}), \qquad P(Y > x) = O(e^{-bx}).$$


Proof. By symmetry (change $Y$ to $-Y$), it suffices to establish the right-tail bounds.
We have, for any $s$ such that $0 < s \le b$,
$$\begin{aligned}
P(Y > x) &= P\left(e^{sY} > e^{sx}\right) \\
&= P\left(\frac{e^{sY}}{E(e^{sY})} > \frac{e^{sx}}{E(e^{sY})}\right) \\
&\le \lambda(s)\,e^{-sx},
\end{aligned} \tag{23}$$
where the last line results from Markov’s inequality (Appendix A.3: Combinatorial
probability, p. 727). It then suffices to choose $s = b$. □
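The bound (23) can be checked on a concrete distribution. The sketch below is only an illustration under an assumption of ours: we take $Y$ to be Poisson of rate 2, for which the Laplace transform $\lambda(s) = \exp(2(e^s - 1))$ exists for all real $s$, so that any $s > 0$ is admissible. The helper names are hypothetical.

```python
import math


def poisson_tail(rate: float, x: int, terms: int = 200) -> float:
    """P(Y > x) for Y ~ Poisson(rate), by summing the probability mass function."""
    p = math.exp(-rate)  # P(Y = 0)
    total = 0.0
    for k in range(terms):
        if k > x:
            total += p
        p *= rate / (k + 1)  # move from P(Y = k) to P(Y = k + 1)
    return total


def markov_bound(rate: float, x: float, s: float) -> float:
    """lambda(s) * exp(-s x), with lambda(s) = E(e^{sY}) = exp(rate (e^s - 1))."""
    return math.exp(rate * (math.exp(s) - 1.0) - s * x)


# The tail is dominated by the bound for every admissible s.
for x in (5, 10, 20):
    print(x, poisson_tail(2.0, x), markov_bound(2.0, x, 1.0))
```

Larger values of $s$ give a faster exponential decay in $x$ at the price of a larger constant $\lambda(s)$, which is the trade-off that large deviation estimates optimize.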


Like its discrete counterpart, Theorem IX.3, this theorem is technically quite shal- 
low but still useful, since it sets the stage for the ulterior development of large devia- 
tion estimates, in Section IX. 10. 


IX. 5. Quasi-powers and Gaussian limit laws 


The central limit theorem of probability theory admits a fruitful extension in the 
context of analytic combinatorics. As we show in this section, it suffices that the PGF 
of a combinatorial parameter behaves nearly like a large power of a fixed function to 
ensure convergence to a Gaussian limit; this is the quasi-powers framework. We first
illustrate this point by considering the Stirling cycle distribution. 


Example IX.9. The Stirling cycle distribution. The number $\chi$ of cycles in a permutation is
described by the BGF
$$\mathcal{P} = \mathrm{SET}(u\,\mathrm{CYC}(\mathcal{Z})) \quad\Longrightarrow\quad P(z, u) = \exp\left(u \log\frac{1}{1 - z}\right) = (1 - z)^{-u}.$$
Let $X_n$ be the random variable corresponding to $\chi$ taken over $\mathcal{P}_n$. The PGF of $X_n$ is
$$p_n(u) = \binom{n + u - 1}{n} = \frac{u(u+1)(u+2)\cdots(u+n-1)}{n!} = \frac{\Gamma(u + n)}{\Gamma(u)\,\Gamma(n + 1)}.$$
We find for $u$ near 1,
$$p_n(u) \equiv E(u^{X_n}) = \frac{n^{u-1}}{\Gamma(u)}\left(1 + O\left(\frac{1}{n}\right)\right) = \frac{1}{\Gamma(u)}\left(e^{u-1}\right)^{\log n}\left(1 + O\left(\frac{1}{n}\right)\right). \tag{24}$$
The last estimate results from Stirling’s formula for the Gamma function (or from singularity
analysis of $[z^n](1 - z)^{-u}$, Chapter VI), with the error term being uniformly $O(n^{-1})$, provided
$u$ stays in a small enough neighbourhood of 1, for instance $|u - 1| \le 1/2$. Thus, as $n \to +\infty$,
the PGF $p_n(u)$ approximately equals a large power of $e^{u-1}$, taken with exponent $\log n$ and
multiplied by the fixed function $\Gamma(u)^{-1}$. By analogy with the Central Limit Theorem, we may
reasonably expect a Gaussian law to hold.

The mean satisfies $\mu_n = \log n + \gamma + o(1)$ and the standard deviation is $\sigma_n = \sqrt{\log n} + o(1)$.
We then consider the standardized random variable,
$$X_n^* = \frac{X_n - L - \gamma}{\sqrt{L}}, \qquad \text{where } L := \log n.$$
The characteristic function of $X_n^*$, namely $\phi_n^*(t) = E(e^{itX_n^*})$, then inherits the estimate (24)
of $p_n(u)$:
$$\phi_n^*(t) = \frac{e^{-it(L^{1/2} + \gamma L^{-1/2})}}{\Gamma(e^{it/\sqrt{L}})}\,\exp\left(L\left(e^{it/\sqrt{L}} - 1\right)\right)\left(1 + O\left(\frac{1}{n}\right)\right).$$
For fixed $t$, with $L \to \infty$, the logarithm is then found mechanically to satisfy
$$\log \phi_n^*(t) = -\frac{t^2}{2} + O\left((\log n)^{-1/2}\right), \tag{25}$$
so that $\phi_n^*(t) \sim e^{-t^2/2}$. This is sufficient to establish a Gaussian limit law,
$$\lim_{n \to \infty} P\left\{X_n \le \log n + \gamma + x\sqrt{\log n}\right\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-w^2/2}\,dw. \tag{26}$$
Proposition IX.5 (Goncharov’s Theorem). The Stirling cycle distribution,
$P(X_n = k) = \frac{1}{n!}\begin{bmatrix} n \\ k \end{bmatrix}$, describing the number of cycles (equivalently, the number of
records) in a random permutation of size $n$ is asymptotically normal.


This result was obtained by Goncharov as early as 1944 (see [299]), albeit without an error
term, as his investigations predate the Berry–Esseen inequalities. Our treatment quantifies the
speed of convergence to the Gaussian limit as $O((\log n)^{-1/2})$, by virtue of Equation (25) and
Theorem IX.5. □
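Goncharov’s theorem can be observed on exact data. The following sketch (helper names are ours) builds the distribution of $X_n$ from the product form $p_n(u) = \prod_{j=0}^{n-1} (u + j)/(j + 1)$ and measures, at the atoms of the distribution, the distance between its distribution function and that of a Gaussian with the same mean and standard deviation. This is a numerical illustration under these conventions, not a proof of the stated convergence rate.

```python
import math


def cycle_distribution(n: int) -> list[float]:
    """P(X_n = k) for k = 0..n, from p_n(u) = prod_{j=0}^{n-1} (u + j)/(j + 1)."""
    probs = [1.0]  # PGF of the empty permutation
    for j in range(n):
        nxt = [0.0] * (len(probs) + 1)
        for k, p in enumerate(probs):
            nxt[k] += p * j / (j + 1)   # term j/(j+1): no new cycle is created
            nxt[k + 1] += p / (j + 1)   # term u/(j+1): one new cycle is created
        probs = nxt
    return probs


def normal_cdf(x: float) -> float:
    """Distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def distance_to_gaussian(n: int) -> float:
    """Max over atoms k of |P(X_n <= k) - Phi((k - mean)/sd)|, exact mean and sd."""
    probs = cycle_distribution(n)
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    sd = math.sqrt(var)
    worst, cdf = 0.0, 0.0
    for k, p in enumerate(probs):
        cdf += p
        worst = max(worst, abs(cdf - normal_cdf((k - mean) / sd)))
    return worst


for n in (10, 100, 1000):
    print(n, distance_to_gaussian(n))
```

Since the standard deviation only grows like $\sqrt{\log n}$, the distance decreases slowly, in line with the $O((\log n)^{-1/2})$ speed of convergence.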


The cycle example is characteristic of the occurrence of Gaussian laws in analytic
combinatorics. What happens is that the approximation (24) by a power with “large”
exponent $\beta_n = \log n$ leads, after normalization, to the characteristic function of a
Gaussian variable, namely $e^{-t^2/2}$. From this, the limit distribution (26) results by the
continuity theorem. This is in fact a very general phenomenon, as demonstrated by
a theorem of Hsien-Kuei Hwang [337, 340] that we state next and that builds upon
earlier statements of Bender and Richmond [44].

The following notations will prove especially convenient: given a function $f(u)$
analytic at $u = 1$ and assumed to satisfy $f(1) \ne 0$, we set
$$m(f) = \frac{f'(1)}{f(1)}, \qquad v(f) = \frac{f''(1)}{f(1)} + \frac{f'(1)}{f(1)} - \left(\frac{f'(1)}{f(1)}\right)^2. \tag{27}$$
The notations $m$, $v$ suggest their probabilistic counterparts while neatly distinguishing
between the analytic and probabilistic realms: If $f$ is the PGF of a random variable $X$,
then $f(1) = 1$ and $m(f)$, the mean, coincides with the expectation $E(X)$; the quantity
$v(f)$ then coincides with the variance $V(X)$. Accordingly, we call $m(f)$ and $v(f)$,
respectively, the analytic mean and analytic variance of function $f$.


Theorem IX.8 (Quasi-powers Theorem). Let the $X_n$ be non-negative discrete random
variables (supported by $\mathbb{Z}_{\ge 0}$), with probability generating functions $p_n(u)$. Assume
that, uniformly in a fixed complex neighbourhood of $u = 1$, for sequences $\beta_n, \kappa_n \to +\infty$,
there holds
$$p_n(u) = A(u) \cdot B(u)^{\beta_n}\left(1 + O\left(\frac{1}{\kappa_n}\right)\right), \tag{28}$$
where $A(u)$, $B(u)$ are analytic at $u = 1$ and $A(1) = B(1) = 1$. Assume finally that
$B(u)$ satisfies the so-called “variability condition”,
$$v(B(u)) \equiv B''(1) + B'(1) - B'(1)^2 \ne 0.$$
Under these conditions, the mean and variance of $X_n$ satisfy
$$\begin{aligned}
\mu_n \equiv E(X_n) &= \beta_n\, m(B(u)) + m(A(u)) + O\left(\kappa_n^{-1}\right), \\
\sigma_n^2 \equiv V(X_n) &= \beta_n\, v(B(u)) + v(A(u)) + O\left(\kappa_n^{-1}\right).
\end{aligned} \tag{29}$$
The distribution of $X_n$ is, after standardization, asymptotically Gaussian, and the
speed of convergence to the Gaussian limit is $O(\kappa_n^{-1} + \beta_n^{-1/2})$:
$$P\left\{\frac{X_n - E(X_n)}{\sqrt{V(X_n)}} \le x\right\} = \Phi(x) + O\left(\frac{1}{\kappa_n} + \frac{1}{\sqrt{\beta_n}}\right), \tag{30}$$
where $\Phi(x)$ is the distribution function of a standard normal,
$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-w^2/2}\,dw.$$
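As a sanity check of the moment estimates (29), one may return to the Stirling cycle distribution of (24), for which $A(u) = 1/\Gamma(u)$, $B(u) = e^{u-1}$, $\beta_n = \log n$ and $\kappa_n = n$. The sketch below relies on two classical facts that are not derived in this section and are therefore to be read as assumptions of the example: the exact mean and variance of the number of cycles are $H_n$ and $H_n - H_n^{(2)}$ (harmonic numbers of orders 1 and 2), and $m(1/\Gamma) = \gamma$, $v(1/\Gamma) = \gamma - \pi^2/6$. The function names are ours.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, gamma


def exact_mean_variance(n: int) -> tuple[float, float]:
    """Exact mean H_n and variance H_n - H_n^(2) of the number of cycles."""
    h1 = sum(1.0 / j for j in range(1, n + 1))
    h2 = sum(1.0 / (j * j) for j in range(1, n + 1))
    return h1, h1 - h2


def quasi_powers_prediction(n: int) -> tuple[float, float]:
    """beta_n m(B) + m(A) and beta_n v(B) + v(A), with A = 1/Gamma, B = e^(u-1)."""
    beta = math.log(n)
    m_b, v_b = 1.0, 1.0                      # B'(1) = B''(1) = 1, so v(B) = 1
    m_a = EULER_GAMMA                        # m(1/Gamma) = -Gamma'(1) = gamma
    v_a = EULER_GAMMA - math.pi ** 2 / 6.0   # v(1/Gamma) = gamma - pi^2/6
    return beta * m_b + m_a, beta * v_b + v_a


for n in (100, 10000):
    print(n, exact_mean_variance(n), quasi_powers_prediction(n))
```

The two pairs of values should differ by a quantity of order $1/n$, which is the $O(\kappa_n^{-1})$ term of (29) in this instance.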

This theorem is a direct application of the following lemma, also due to 
Hwang [337, 340], that applies more generally to arbitrary discrete or continuous 
distributions (see also Note IX.22, p. 647), and is thus entirely phrased in terms of
integral transforms. 


Lemma IX.1 (Quasi-powers, general distributions). Assume that the Laplace transforms
$\lambda_n(s) = E\{e^{sX_n}\}$ of a sequence of random variables $X_n$ are analytic in a disc
$|s| < \rho$, for some $\rho > 0$, and satisfy there an expansion of the form
$$\lambda_n(s) = e^{\beta_n U(s) + V(s)}\left(1 + O\left(\frac{1}{\kappa_n}\right)\right), \tag{31}$$
with $\beta_n, \kappa_n \to +\infty$ as $n \to +\infty$, and $U(s)$, $V(s)$ analytic in $|s| < \rho$. Assume also
the variability condition, $U''(0) \ne 0$.

Under these assumptions, the mean and variance of $X_n$ satisfy
$$\begin{aligned}
E(X_n) &= \beta_n U'(0) + V'(0) + O\left(\kappa_n^{-1}\right), \\
V(X_n) &= \beta_n U''(0) + V''(0) + O\left(\kappa_n^{-1}\right).
\end{aligned} \tag{32}$$
The distribution of $X_n^* := (X_n - \beta_n U'(0))/\sqrt{\beta_n U''(0)}$ is asymptotically Gaussian,
the speed of convergence to the Gaussian limit being $O(\kappa_n^{-1} + \beta_n^{-1/2})$.


Proof. First, we estimate the mean and variance. The variable $s$ is a priori restricted
to a small neighbourhood of 0. By assumption, the function $\log \lambda_n(s)$ is analytic at 0
and it satisfies
$$\log \lambda_n(s) = \beta_n U(s) + V(s) + O\left(\frac{1}{\kappa_n}\right).$$
This asymptotic expansion carries over, with the same type of error term, to derivatives
at 0 because of analyticity: this can be checked directly from Cauchy integral
representations,
$$\left.\frac{1}{k!}\frac{d^k}{ds^k} \log \lambda_n(s)\right|_{s=0} = \frac{1}{2i\pi}\int_{\gamma} \log \lambda_n(s)\,\frac{ds}{s^{k+1}},$$
upon using a small but fixed integration contour $\gamma$ and taking advantage of the basic
expansion of $\log \lambda_n(s)$. In particular, the mean and variance are seen to satisfy the
estimates of (32).

Next, we consider the standardized variable,
$$X_n^* = \frac{X_n - \beta_n U'(0)}{\sqrt{\beta_n U''(0)}}, \qquad \lambda_n^*(s) = E\{e^{sX_n^*}\}.$$
We have
$$\log \lambda_n^*(s) = -\frac{\beta_n U'(0)}{\sqrt{\beta_n U''(0)}}\,s + \log \lambda_n\left(\frac{s}{\sqrt{\beta_n U''(0)}}\right).$$
Local expansions to third order based on the assumption (31), with $\lambda_n(0) = 1$, yield
$$\log \lambda_n^*(s) = \frac{s^2}{2} + O\left(\frac{|s| + |s|^3}{\beta_n^{1/2}}\right) + O\left(\frac{1}{\kappa_n}\right), \tag{33}$$
uniformly with respect to $s$ in a disc of radius $O(\beta_n^{1/2})$, and in particular in any fixed
neighbourhood of 0. This is enough to conclude as regards convergence in distribution
to a Gaussian limit, by the continuity theorem of either Laplace transforms (restricting
$s$ to be real) or of Fourier transforms (taking $s = it$).

Finally, the speed of convergence results from the Berry–Esseen inequalities.
Take $T \equiv T_n = c\,\beta_n^{1/2}$, where $c$ is taken sufficiently small but non-zero, in such a
way that the local expansion of $\lambda_n(s)$ at 0 applies. Then, the expansion (33) instantiated
at $s = it$ entails that
$$\Delta_n := \int_{-T_n}^{T_n} \left|\frac{\lambda_n^*(it) - e^{-t^2/2}}{t}\right| dt + \frac{1}{T_n}$$
satisfies $\Delta_n = O(\beta_n^{-1/2} + \kappa_n^{-1})$. The statement now follows from the Berry–Esseen
inequality, Theorem IX.5. □


Theorem IX.8 under either form (28) or (31) can be read formally as expressing
the distribution of a (pseudo)random variable
$$Z = Y_0 + W_1 + W_2 + \cdots + W_{\beta_n},$$
where $Y_0$ “corresponds” to $e^{V(s)}$ (or $A(u)$) and each $W_j$ to $e^{U(s)}$ (or $B(u)$). However,
there is no a priori requirement that $\beta_n$ should be an integer, nor that $e^{U(s)}$, $e^{V(s)}$ be
Laplace transforms of probability distribution functions (usually they aren’t). In a
way, the theorem recycles the intuition that underlies the classical proof of the central
limit theorem and makes use of the analytic machinery behind it.

It is of particular importance to note that the conditions of Theorem IX.8 and
Lemma IX.1 are purely local: what is required is local analyticity of the quasi-power
approximation at $u = 1$ for PGFs or, equivalently, $s = 0$ for Laplace–Fourier transforms.
This important feature ultimately owes to the standardization of random variables
and the corresponding scaling of transforms that goes along with continuous
limit laws.
▷ IX.21. Mean, variance and cumulants. With the notations of (27), one has also
$$m(f) = \left.\frac{d}{dt} \log f(e^t)\right|_{t=0}, \qquad v(f) = \left.\frac{d^2}{dt^2} \log f(e^t)\right|_{t=0};$$
the higher order derivatives give rise to quantities known as cumulants. ◁

▷ IX.22. Two equivalent forms of standardization. By simple real analysis, one has also, under
the assumptions of Lemma IX.1:
$$P\left\{\frac{X_n - E(X_n)}{\sqrt{V(X_n)}} \le x\right\} = \Phi(x) + O\left(\frac{1}{\kappa_n} + \frac{1}{\sqrt{\beta_n}}\right).$$
Thus, main approximations in the convergence to the Gaussian limit are not affected by the way
standardization is done, either with the exact values of the mean and variance of $X_n$ or with
their first-order asymptotic approximations. The same is true for Theorem IX.8. ◁


▷ IX.23. Higher moments under quasi-powers conditions. Following Hwang [340], one has
also, under the conditions of the Quasi-powers Theorem and for each fixed $k$,
$$E(X_n^k) = \varpi_k(\beta_n) + O\left(\frac{\beta_n^{k-1}}{\kappa_n}\right), \qquad \varpi_k(x) := k!\,[s^k]\,e^{xU(s) + V(s)}.$$
Thus, a polynomial $\varpi_k$, of exact degree $k$, describes the asymptotic form of higher moments.
(Hint: make use of differentiability properties of asymptotic expansions of analytic functions,
as in Subsection VI. 10.1, p. 418.) ◁


Singularity perturbation and Gaussian laws. The main thread of this chapter is 
that of bivariate generating functions. In general, we are given a BGF F(z, u) and aim 
at extracting a limit distribution from it. The quasi-power paradigm in the form (28) 
is what one should look for, when the mean and the standard deviation both tend to 
infinity with the size n of the combinatorial model. 

We proceed heuristically in the following informal discussion, which expands on
the brief indications of p. 618 relative to singularity perturbation; precise developments
are given in the next sections. Start from a BGF $F(z, u)$ and consider $u$ as a
parameter. If a singularity analysis of sorts is applicable to the counting generating
function $F(z, 1)$, it leads to an approximation,
$$f_n \approx C \cdot \rho^{-n}\, n^{\alpha},$$
where $\rho$ is the dominant singularity of $F(z, 1)$ and $\alpha$ is related to the critical exponent
of $F(z, 1)$ at $\rho$. A similar type of analysis is often applicable to $F(z, u)$ for $u$
near 1. Then, it is reasonable to hope for an approximation of the coefficients in the
$z$-expansion of the bivariate GF,
$$f_n(u) \approx C(u)\,\rho(u)^{-n}\, n^{\alpha(u)}.$$
In this perspective, the corresponding PGF is of the form
$$p_n(u) \approx \frac{C(u)}{C(1)} \left(\frac{\rho(u)}{\rho(1)}\right)^{-n} n^{\alpha(u) - \alpha(1)}.$$
The strategy envisioned here is thus a perturbation analysis of singular expansions
with the auxiliary parameter $u$ being restricted to a small neighbourhood of 1.
In particular if only the dominant singularity moves with $u$, we have a rough form
$$p_n(u) \approx \frac{C(u)}{C(1)} \left(\frac{\rho(1)}{\rho(u)}\right)^{n},$$
suggesting a Gaussian law with mean and variance that are both $O(n)$, by the Quasi-powers
Theorem. If only the exponent varies, then
$$p_n(u) \approx \frac{C(u)}{C(1)}\, n^{\alpha(u) - \alpha(1)} = \frac{C(u)}{C(1)} \left(e^{\alpha(u) - \alpha(1)}\right)^{\log n}$$
suggests again a Gaussian law, but with mean and variance that are now both $O(\log n)$.


    Region                  Property
    u = 1                   Counting
    u = 1 + o(1)            Moments
    u ∈ V(1) (neighb.)      Central limit law
    |u| = 1                 Local limit law
    u ∈ [α, β]              Large deviations

Figure IX.9. The correspondence between regions of the u-plane when considering
a combinatorial BGF F(z, u) and asymptotic properties of combinatorial distributions.


These cases point to the fact that a rather simple perturbation of a univariate analysis
is likely to yield a limiting Gaussian distribution. Each major coefficient extraction
method of Chapters IV–VIII then plays a rôle, and the present chapter illustrates
this important point in the following contexts.


— Meromorphic analysis for functions with polar singularities (Section IX. 6 
below, based on a perturbation of methods of Chapters IV and V); 

— Singularity analysis for functions with algebraic—logarithmic singularity 
(Section IX.7 below, based on a perturbation of methods of Chapters VI 
and VII); 

— Saddle-point analysis for functions with fast growth at their singularity (Section
IX. 8 below, based on a perturbation of methods of Chapter VIII).


In essence, the decomposable character of many elementary combinatorial structures 
is reflected by strong analyticity properties of bivariate GFs that, after perturbation 
analysis, lead, via the Quasi-powers Theorem (Theorem IX.8), to Gaussian laws. The 
coefficient extraction methods being based on contour integration supply the necessary 
uniformity conditions. 

We shall also see that several other properties often supplement the existence of 
Gaussian limit laws in combinatorics: 


— Local limit laws [developed in Section IX. 9, p. 694 below] arise from quasi-power
approximations, whenever these remain valid for all values of $u$ on
the unit circle. In that case, it is possible to express the combinatorial probability
distribution directly in terms of the Gaussian density, by means of
the saddle-point method (in a form similar to that of Section VIII. 8, p. 585,
dedicated to large powers) replacing the Continuity Theorem to effect the
secondary coefficient extraction in $[u^k z^n]F(z, u)$.

— Large deviation estimates [developed in Section IX. 10, p. 699 below] quantify
the probabilities of rare events, away from the mean value. As could be
anticipated from Subsection IX. 4.3 relative to tail bounds, they are obtained
by considering $[z^n]F(z, u_0)$ for some value of $u_0$ away from 1, via what are
essentially saddle-point bounds applied to $[z^n]F(z, u_0)$.


The correspondence between u-domains and properties of combinatorial distributions
is summarized in Figure IX.9. The next sections will copiously illustrate this paradigm
for each of the main complex asymptotic methods of Part B.


IX. 6. Perturbation of meromorphic asymptotics 


Once equipped with the general Quasi-powers Theorem, Theorem IX.8 (p. 645), 
it becomes possible to proceed and analyse broad classes of analytic schemas, along 
the lines of the principles of singularity perturbation informally presented in the previ- 
ous section. We commence by investigating the effect of the secondary variable u on 
a bivariate generating function, whose univariate restriction F(z, 1) can be subjected 
to a meromorphic analysis (Chapters IV and V), that is, its dominant singularities are 
poles. For basic parameters arising from the constructions examined there, Gaussian 
laws are the rule. 

In what follows, we first examine supercritical compositions and sequences and 
establish the Gaussian character of the number of components. In this way, one gets 
precise information on the profile of supercritical sequences, which greatly refines the 
mean value estimates of Section V.2, p. 293. We next enunciate a powerful state- 
ment widely applicable to meromorphic functions, with typical applications to runs in 
permutations, parallelogram polyominoes, and coin fountains. The section concludes 
with an investigation of the elementary perturbation theory of linear systems, whose 
applications are in the area of paths in graphs, finite automata, and transfer matrix 
models (Sections V.5 and V. 6). 

This section is largely based on works of Bender who, starting with his seminal 
article [35], was the first to propose abstract analytic schemas leading to Gaussian laws 
in analytic combinatorics. Our presentation also relies on subsequent works of Ben- 
der, Flajolet, Hwang, Richmond, and Soria [44, 258, 260, 337, 338, 339, 340, 547]. 
The essential philosophy here is that (almost) any univariate problem studied in Chap- 
ter V relative to rational and meromorphic asymptotics is susceptible to singularity 
perturbation, to the effect that limit Gaussian laws hold for basic parameters. 


Supercritical compositions and sequences. Our first application of the quasi- 
powers framework is to supercritical compositions (p. 411), whenever the outer func- 
tion has a dominant pole. This covers in particular supercritical sequences, for which 
asymptotic enumeration and moments have been worked out in Section V. 2, p. 293. 
In this way, we get access to distributions arising in surjections, alignments, and com- 
positions of various sorts. Our reader is encouraged to study the proof that follows, 
since it constitutes the technically simplest, yet characteristic, instance of a singularity 
perturbation process. 


Proposition IX.6 (Supercritical compositions). Consider the bivariate composition
schema $F(z, u) = g(u h(z))$. Assume that $g(z)$ and $h(z)$ satisfy the supercriticality
condition $\tau_h > \rho_g$, that $g$ is analytic in $|z| < R$ for some $R > \rho_g$, with a unique
dominant singularity at $\rho_g$, which is a simple pole, and that $h$ is aperiodic. Then the
number $\chi$ of $\mathcal{H}$-components in a random $\mathcal{F}_n$-structure, corresponding to the probability
distribution $[u^k z^n]F(z, u)/[z^n]F(z, 1)$, has a mean and variance that are asymptotically
proportional to $n$; after standardization, the parameter $\chi$ satisfies a limiting
Gaussian distribution, with speed of convergence $O(1/\sqrt{n})$.


Proof. We start as usual with univariate analyses. Let $\rho$ be such that $h(\rho) = \rho_g$
with $0 < \rho < \rho_h$. (Existence and unicity of $\rho$ are guaranteed by the supercriticality
condition.) The expansions,
$$g(z) = \frac{C}{1 - z/\rho_g} + D + o(1), \qquad h(z) = \rho_g + h'(\rho)(z - \rho) + \frac{1}{2}h''(\rho)(z - \rho)^2 + \cdots,$$
result from the hypotheses. Clearly, $F(z) \equiv F(z, 1)$ has a simple pole at $z = \rho$ and,
by composition of the expansions of $g$ and $h$:
$$F(z) = \frac{C\rho_g}{\rho\, h'(\rho)} \cdot \frac{1}{1 - z/\rho} + O(1).$$
Aperiodicity of $h$ also implies that $\rho$ is the unique dominant singularity of $F(z, 1)$.
The usual process of meromorphic coefficient analysis then provides
$$[z^n]F(z) = \frac{C\rho_g}{\rho\, h'(\rho)}\,\rho^{-n}\,(1 + o(1)),$$
where $o(1)$ represents an exponentially small error term. Moments can be obtained
by differentiation, to the effect that the GF associated to the moment of order $r$ has
a pole of order $(r + 1)$ and is amenable to singularity analysis. (This mimics the
univariate analysis of supercritical compositions in Section V.2, p. 293.) However,
moment estimates also result from subsequent developments, so that this phase of the
analysis can be bypassed.

Now comes the singularity perturbation process. In what follows, we repeatedly
restrict $u$ to a sufficiently small neighbourhood of 1. The equation in $\rho(u)$,
$$u\,h(\rho(u)) = \rho_g$$

admits a unique root near $\rho$, when $u$ is sufficiently close to 1, and by the analytic
inversion lemma (Lemma IV.2, p. 275), the function $\rho(u)$ is analytic at $u = 1$. The
function $z \mapsto F(z, u)$ then has a simple pole at $z = \rho(u)$, and, by composition of
expansions, we obtain:
$$F(z, u) \sim \frac{C\rho_g}{u\,\rho(u)\,h'(\rho(u))} \cdot \frac{1}{1 - z/\rho(u)} \qquad (z \to \rho(u)). \tag{34}$$
Next, for $u$ again close enough to 1, we claim that the function $z \mapsto F(z, u)$
admits $\rho(u)$ as unique dominant singularity. The proof of this fact depends on the
aperiodicity of $h(z)$, which grants us the inequality $|h(z)| < h(\rho) = \rho_g$ for $|z| = \rho$,
$z \ne \rho$; also, for $z$ near $\rho$, the equation $h(z) = \rho_g$ admits locally a unique solution, as
already seen above. Thus, there exists a quantity $r > \rho$ such that the equation $h(z) = \rho_g$
admits in $|z| \le r$ the unique solution $z = \rho$. But then, by keeping $u$ close enough
to 1, one can find $S$ with $\rho < S < r$, such that, in $|z| \le S$, the unique solution to
the equation $u h(z) = \rho_g$ is $\rho(u)$ (see the continuity argument used in the proof of the
Analytic Inversion Theorem of Appendix B.5: Implicit Function Theorem, p. 753).

We can now conclude. Let us take $S$ as in the previous paragraph and restrict $u$
to a suitably small complex neighbourhood of 1, as the need arises. We then revisit
the proof by contour integration of coefficient extraction in meromorphic functions,
Theorem IV.10, p. 258. We have, by residues,
$$\frac{1}{2i\pi}\int_{|z| = S} F(z, u)\,\frac{dz}{z^{n+1}} = [z^n]F(z, u) + \mathrm{Res}\left(g(u h(z))\,z^{-n-1},\; z = \rho(u)\right),$$
and, since $F(z, u) = g(u h(z))$ is analytic, hence uniformly bounded, for $|z| = S$, we
get via (34) the main uniform estimate
$$[z^n]F(z, u) = C(u) \cdot \rho(u)^{-n}\left(1 + O(K^{-n})\right), \qquad C(u) := \frac{C\rho_g}{u\,\rho(u)\,h'(\rho(u))},$$
for some $K > 1$. Thus, the PGF of $\chi$ over $\mathcal{F}_n$, which is $p_n(u) = [z^n]F(z, u)/[z^n]F(z, 1)$, satisfies
$$p_n(u) = A(u)\,B(u)^n\left(1 + O(K^{-n})\right), \qquad A(u) = \frac{C(u)}{C(1)}, \quad B(u) = \frac{\rho(1)}{\rho(u)}.$$
We are then precisely within the conditions of the Quasi-powers Theorem (Theorem
IX.8, p. 645), and the statement follows. □


A prime application of the last proposition is to supercritical sequences, where the 
properties elicited in Section V. 2, p. 293, are seen to be supplemented by Gaussian 
laws. 


Proposition IX.7 (Supercritical sequences). Consider a sequence schema $\mathcal{F} = \mathrm{SEQ}(u\mathcal{H})$
that is supercritical, i.e., the value of $h$ at its dominant positive singularity
satisfies $\tau_h > 1$. Assuming $h$ to be aperiodic and $h(0) = 0$, the number $X_n$
of $\mathcal{H}$-components in a random $\mathcal{F}_n$-structure of large size $n$ is, after standardization,
asymptotically Gaussian with
$$E(X_n) \sim \frac{n}{\rho\, h'(\rho)}, \qquad V(X_n) \sim n\,\frac{\rho h''(\rho) + h'(\rho) - \rho h'(\rho)^2}{\rho^2 h'(\rho)^3},$$
where $\rho$ is the positive root of $h(\rho) = 1$.

The number $X_n^{(m)}$ of components of some fixed size $m$ is asymptotically Gaussian
with mean $\sim \theta_m n$, where $\theta_m = h_m \rho^m/(\rho h'(\rho))$.

Proof. The first part is a direct consequence of Proposition IX.6 with $g(z) = (1 - z)^{-1}$
and $\rho_g$ replaced by 1. The second part results from the BGF
$$\mathcal{F} = \mathrm{SEQ}(u\mathcal{H}_m + \mathcal{H} \setminus \mathcal{H}_m) \quad\Longrightarrow\quad F(z, u) = \frac{1}{1 - (u - 1)h_m z^m - h(z)},$$
and from the fact that $u$ near 1 induces a smooth perturbation of the pole of $F(z, 1)$ at $\rho$,
corresponding to $u = 1$. □
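The mean and variance estimates of Proposition IX.7 can be compared with exact counts on a small case. The sketch below is an illustration under an assumption of ours: we take integer compositions with summands in $\{1, 2\}$, so that $h(z) = z + z^2$ and $\rho = (\sqrt{5} - 1)/2$. It computes $[u^k z^n]\,1/(1 - u h(z))$ by a direct recurrence on exact integers and compares the ratios mean$/n$ and variance$/n$ with the constants of the proposition. Function names are hypothetical.

```python
import math


def component_counts(h: list[int], n: int) -> list[int]:
    """counts[k] = [u^k z^n] 1/(1 - u h(z)), where h[m] = h_m and h[0] = 0."""
    # rows[j][k] counts sequences of k components with total size j
    rows = [[0] * (n + 1) for _ in range(n + 1)]
    rows[0][0] = 1
    for j in range(1, n + 1):
        for m in range(1, min(j, len(h) - 1) + 1):
            if h[m] == 0:
                continue
            prev = rows[j - m]
            for k in range(1, j + 1):
                rows[j][k] += h[m] * prev[k - 1]
    return rows[n]


def mean_variance(counts: list[int]) -> tuple[float, float]:
    """Mean and variance of the distribution proportional to counts[k]."""
    total = sum(counts)
    mean = sum(k * c for k, c in enumerate(counts)) / total
    second = sum(k * k * c for k, c in enumerate(counts)) / total
    return mean, second - mean * mean


def asymptotic_rates(rho: float, h1: float, h2: float) -> tuple[float, float]:
    """Constants of Proposition IX.7, with h1 = h'(rho) and h2 = h''(rho)."""
    mean_rate = 1.0 / (rho * h1)
    var_rate = (rho * h2 + h1 - rho * h1 ** 2) / (rho ** 2 * h1 ** 3)
    return mean_rate, var_rate


rho = (math.sqrt(5.0) - 1.0) / 2.0           # positive root of z + z^2 = 1
rates = asymptotic_rates(rho, 1.0 + 2.0 * rho, 2.0)
mean, var = mean_variance(component_counts([0, 1, 1], 200))
print(mean / 200, var / 200, rates)
```

Exact integer arithmetic is used for the counts since they grow exponentially with $n$; only the final ratios are converted to floating point.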


The examples and notes that follow present two different types of applications
of Propositions IX.6 and IX.7. The first batch deals with cases already encountered
in Chapter V, namely, surjections (Example IX.10), alignments, and compositions;
Figure V.1 (p. 297) and Figure IX.10 illustrate typical profiles of these structures. The
second batch shows some purely probabilistic applications to closely related renewal
problems (Example IX.11).

Figure IX.10. When components are sorted by size and represented by vertical segments
of corresponding length, supercritical sequences present various profiles described
by Proposition IX.7. The diagrams display the limit mean profiles of large
compositions, surjections, and alignments, for component sizes ≤ 5.


Example IX.10. The surjection distribution. We revisit the distribution of image cardinality in
surjections for which the concentration property has been established in Chapter V. This example
serves to introduce bivariate asymptotics in the meromorphic case. Consider the distribution
of image cardinality in surjections,
$$\mathcal{F} = \mathrm{SEQ}(u\,\mathrm{SET}_{\ge 1}(\mathcal{Z})) \quad\Longrightarrow\quad F(z, u) = \frac{1}{1 - u(e^z - 1)}.$$
Restrict $u$ near 1, for instance $|u - 1| \le 1/10$. The function $F(z, u)$, as a function of $z$, is
meromorphic with singularities at
$$\rho(u) + 2ik\pi, \qquad \rho(u) := \log\left(1 + \frac{1}{u}\right).$$
The principal determination of the logarithm is used (with $\rho(u)$ near $\log 2$ when $u$ is near 1). It
is then seen that $\rho(u)$ stays within 0.06 from $\log 2$, for $|u - 1| \le 1/10$. Thus $\rho(u)$ is the unique
dominant singularity of $F$, the next nearest ones being $\rho(u) \pm 2i\pi$, with modulus certainly larger
than 5.

From the coefficient analysis of meromorphic functions (Chapter IV), the quantities
$f_n(u) = [z^n]F(z, u)$ are estimated as follows,
$$\begin{aligned}
f_n(u) &= -\mathrm{Res}\left(F(z, u)\,z^{-n-1}\right)_{z = \rho(u)} + \frac{1}{2i\pi}\int_{|z| = 5} F(z, u)\,\frac{dz}{z^{n+1}} \\
&= \frac{1}{u\,\rho(u)\,e^{\rho(u)}}\,\rho(u)^{-n} + O(5^{-n}).
\end{aligned} \tag{35}$$
It is important to note that the error term is uniform with respect to $u$, once $u$ has been constrained
to (say) $|u - 1| \le 0.1$. This fact is derived from the coefficient extraction method,
since, in the remainder Cauchy integral of (35), the denominator of $F(z, u)$ stays bounded
away from 0.

The second estimate in Equation (35) constitutes a prototypical case of application of the
quasi-powers framework. Thus, the number $X_n$ of image points in a random surjection of size $n$
obeys in the limit a Gaussian law. The local expansion of $\rho(u)$,
$$\rho(u) \equiv \log(1 + u^{-1}) = \log 2 - \frac{1}{2}(u - 1) + \frac{3}{8}(u - 1)^2 + \cdots,$$
yields
$$\frac{\rho(1)}{\rho(u)} = 1 + \frac{1}{2\log 2}(u - 1) - \frac{3\log 2 - 2}{8(\log 2)^2}(u - 1)^2 + O\left((u - 1)^3\right),$$
so that the mean and standard deviation satisfy
$$\mu_n \sim C_1 n, \quad \sigma_n \sim \sqrt{C_2 n}, \qquad C_1 := \frac{1}{2\log 2}, \quad C_2 := \frac{1 - \log 2}{4(\log 2)^2}. \tag{36}$$

In particular, the variability condition is satisfied. Finally, one obtains, with $\Phi$ the Gaussian
error function,
$$P\left\{X_n \le C_1 n + x\sqrt{C_2 n}\right\} = \Phi(x) + O\left(\frac{1}{\sqrt{n}}\right).$$
This estimate can alternatively be viewed as a purely asymptotic statement regarding Stirling
partition numbers.

Proposition IX.8. The surjection distribution defined as $\frac{k!}{S_n}\begin{Bmatrix} n \\ k \end{Bmatrix}$, with $S_n := \sum_k k!\begin{Bmatrix} n \\ k \end{Bmatrix}$ a surjection
number, satisfies uniformly for all real $x$ and $C_1$, $C_2$ given by (36):
$$\frac{1}{S_n}\sum_{k \le C_1 n + x\sqrt{C_2 n}} k!\begin{Bmatrix} n \\ k \end{Bmatrix} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-w^2/2}\,dw + O\left(\frac{1}{\sqrt{n}}\right).$$

This result already appears in Bender’s foundational study [35]. □
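Proposition IX.8 lends itself to a direct numerical check. The sketch below (names are ours) computes the numbers $k!\begin{Bmatrix} n \\ k \end{Bmatrix}$ exactly by the recurrence $T(n, k) = k\,(T(n-1, k) + T(n-1, k-1))$, which follows from the usual recurrence of Stirling partition numbers, and compares the resulting distribution function with $\Phi(x)$ under the normalization of (36). It is meant as an illustration, not as a replacement for the proof.

```python
import math


def surjection_counts(n: int) -> list[int]:
    """row[k] = k! * S(n, k): number of surjections from an n-set onto a k-set."""
    row = [1]  # n = 0: only the empty map onto the empty set
    for size in range(1, n + 1):
        new = [0] * (size + 1)
        for k in range(1, size + 1):
            above = row[k] if k < len(row) else 0
            new[k] = k * (above + row[k - 1])
        row = new
    return row


def mean_variance(counts: list[int]) -> tuple[float, float]:
    """Mean and variance of the distribution proportional to counts[k]."""
    total = sum(counts)
    mean = sum(k * c for k, c in enumerate(counts)) / total
    second = sum(k * k * c for k, c in enumerate(counts)) / total
    return mean, second - mean * mean


def cdf_gap(n: int, x: float) -> float:
    """P(X_n <= C1 n + x sqrt(C2 n)) - Phi(x), with C1, C2 as in (36)."""
    log2 = math.log(2.0)
    c1 = 1.0 / (2.0 * log2)
    c2 = (1.0 - log2) / (4.0 * log2 ** 2)
    counts = surjection_counts(n)
    total = sum(counts)
    threshold = c1 * n + x * math.sqrt(c2 * n)
    below = sum(c for k, c in enumerate(counts) if k <= threshold)
    return below / total - 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


for x in (-1.0, 0.0, 1.0):
    print(x, cdf_gap(200, x))
```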


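▷ IX.24. Alignments and Stirling cycle numbers. Alignments are sequences of cycles (Chapter
II, p. 119), with exponential BGF given by
$$\mathcal{F} = \mathrm{SEQ}(u\,\mathrm{CYC}(\mathcal{Z})) \quad\Longrightarrow\quad F(z, u) = \frac{1}{1 - u\log(1 - z)^{-1}}.$$
The function $\rho(u)$ is explicit, $\rho(u) = 1 - e^{-1/u}$, and the number of cycles in a random alignment
is asymptotically Gaussian. This yields an asymptotic statement on Stirling cycle numbers:
Uniformly for all real $x$, with $O_n := \sum_k k!\begin{bmatrix} n \\ k \end{bmatrix}$ the alignment number, there holds
$$\frac{1}{O_n}\sum_{k \le C_1 n + x\sqrt{C_2 n}} k!\begin{bmatrix} n \\ k \end{bmatrix} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-w^2/2}\,dw + O\left(\frac{1}{\sqrt{n}}\right),$$
where the two constants $C_1$, $C_2$ are $C_1 = \dfrac{1}{e - 1}$, $C_2 = \dfrac{1}{(e - 1)^2}$. ◁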


▷ IX.25. Summands in constrained integer compositions. Consider integer compositions where
the summands are constrained to belong to a set $\Gamma \subseteq \mathbb{Z}_{\ge 1}$, and let $X_n$ be the number of
summands in a random composition of integer $n$. The ordinary BGF is
$$F(z, u) = \frac{1}{1 - u\,h(z)}, \qquad h(z) := \sum_{\gamma \in \Gamma} z^{\gamma}.$$
Assume that $\Gamma$ contains at least two relatively prime elements, so that $h(z)$ is aperiodic. The
radius of convergence of $h(z)$ can only be $\infty$ (when $h(z)$ is a polynomial) or 1 (when $h(z)$
comprises infinitely many terms but is dominated by $(1 - z)^{-1}$). In all cases, the sequence
construction is supercritical, so that the distribution of $X_n$ is asymptotically normal. For instance,
a Gaussian limit law holds for compositions into prime (or even twin-prime) summands
enumerated in Chapter V (p. 297). ◁

Example IX.11. The Central Limit Theorem and discrete renewal theory. Let $g(u)$ be any PGF
($g(1) = 1$) of a random variable supported by $\mathbb{Z}_{\ge 0}$ that is analytic at 1 and non-degenerate (i.e.,
$v(g) > 0$). Then
$$F(z, u) = \frac{1}{1 - z\,g(u)}$$
has a singularity at $\rho(u) := 1/g(u)$ that is a simple pole. Theorem IX.9 then applies to give
the special form of the central limit theorem (p. 642) that is relative to discrete probability
distributions with PGFs analytic at 1.

Under the same analytic assumptions on $g$, consider now the “dual” BGF,
$$G(z, u) = \frac{1}{1 - u\,g(z)},$$


where the rôles of $z$ and $u$ have been interchanged. In addition, we must impose for consistency that $g(0) = 0$. There is a simple probabilistic interpretation in terms of renewal processes of classical probability theory, when $g(1) = 1$. Assume a light bulb has a lifetime of $m$ days with probability $g_m = [z^m]g(z)$ and is replaced as soon as it ceases to function. Let $X_n$ be the number of light bulbs consumed in $n$ days assuming independence, conditioned upon the fact that a replacement takes place on the $n$th day. Then the PGF of $X_n$ is $[z^n]G(z,u)/[z^n]G(z,1)$. (The normalizing quantity $[z^n]G(z,1)$ is precisely the probability that a renewal takes place on day $n$.) Theorem IX.9 applies. The function $G$ has a simple dominant pole at $z = \rho(u)$ such that $g(\rho(u)) = 1/u$, with $\rho(1) = 1$ since $g$ is by assumption a PGF. One finds

$$\frac{1}{\rho(u)} = 1 + \frac{1}{g'(1)}(u-1) + \frac{1}{2}\,\frac{g''(1) + 2g'(1) - 2g'(1)^2}{g'(1)^3}(u-1)^2 + \cdots.$$

Thus the limit distribution of $X_n$ is normal with mean and variance satisfying

$$\mathbb{E}(X_n) \sim \frac{n}{\mu}, \qquad \mathbb{V}(X_n) \sim n\,\frac{\sigma^2}{\mu^3},$$

where $\mu := \mathfrak{m}(g)$ and $\sigma^2 := \mathfrak{v}(g)$ are the mean and variance attached to $g$. (This calculation checks the variability condition en passant.) The mean value result certainly conforms to probabilistic intuition.
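The renewal estimates of Example IX.11 can be illustrated on one concrete case. The sketch below is an illustration under an assumed lifetime law that is not in the original: each bulb lasts 1 or 2 days with probability 1/2, so that $g(z) = (z+z^2)/2$, $\mu = 3/2$, $\sigma^2 = 1/4$, and the predicted slopes are $1/\mu = 2/3$ and $\sigma^2/\mu^3 = 2/27$. For this $g$ one has $[z^n u^k]\,1/(1-ug(z)) = [z^n]g(z)^k = \binom{k}{n-k}2^{-k}$, which gives the exact conditioned distribution.

```python
from fractions import Fraction
from math import comb

def renewal_moments(n):
    """Mean and variance of the bulb count X_n when lifetimes are 1 or 2 days, each w.p. 1/2."""
    # [z^n] g(z)^k = comb(k, n - k) / 2**k for g(z) = (z + z^2) / 2.
    weights = {k: Fraction(comb(k, n - k), 2 ** k) for k in range((n + 1) // 2, n + 1)}
    total = sum(weights.values())          # probability that a renewal occurs on day n
    mean = sum(k * w for k, w in weights.items()) / total
    second = sum(k * k * w for k, w in weights.items()) / total
    return mean, second - mean ** 2

mean, var = renewal_moments(300)
print(float(mean) / 300, 2 / 3)    # slope of the mean, compared with 1/mu
print(float(var) / 300, 2 / 27)    # slope of the variance, compared with sigma^2/mu^3
```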


▷ IX.26. Renewals every day. In the renewal scenario, no longer condition on the fact that a bulb breaks down on day $n$. Let $Y_n$ be the number of bulbs consumed so far. Then the BGF of $Y_n$ is found by expressing that there is a sequence of renewals followed by a last renewal that is to be credited to all intermediate epochs:


Yn\-n _ 1 g(u) — g(zu) 
DS Eu )z Sieae ae 


n>1 
A Gaussian limit also holds for $Y_n$. ◁


▷ IX.27. A mixed CLT-renewal scenario. Consider $G(z,u) = 1/(1 - g(z,u))$ where $g$ has non-negative coefficients, satisfies $g(1,1) = 1$, and is analytic at $(z,u) = (1,1)$. This models the situation where bulbs are replaced but a random cost is incurred, depending on the duration of the bulb. Under general conditions, a limit law holds and it is Gaussian. This applies for instance to $H(z,u) = 1/(1 - a(z)b(u))$, where $a$ and $b$ are non-degenerate PGFs (a random repairman is called). ◁
Singularity perturbation for meromorphic functions. The following analytic schema vastly generalizes the case of supercritical compositions.

Theorem IX.9 (Meromorphic schema). Let $F(z,u)$ be a function that is bivariate analytic at $(z,u) = (0,0)$ and has non-negative coefficients. Assume that $F(z,1)$ is meromorphic in $|z| \le r$ with only a simple pole at $z = \rho$ for some positive $\rho < r$. Assume also the following conditions.

(i) Meromorphic perturbation: there exists $\epsilon > 0$ and $r > \rho$ such that in the domain $\mathcal{D} = \{|z| \le r\} \times \{|u-1| < \epsilon\}$, the function $F(z,u)$ admits the representation

$$F(z,u) = \frac{B(z,u)}{C(z,u)},$$

where $B(z,u)$, $C(z,u)$ are analytic for $(z,u) \in \mathcal{D}$ with $B(\rho,1) \ne 0$. (Thus $\rho$ is a simple zero of $C(z,1)$.)

(ii) Non-degeneracy: one has $\partial_z C(\rho,1)\cdot\partial_u C(\rho,1) \ne 0$, ensuring the existence of a non-constant $\rho(u)$ analytic at $u = 1$, such that $C(\rho(u),u) = 0$ and $\rho(1) = \rho$.

(iii) Variability: one has

$$\mathfrak{v}\!\left(\frac{\rho(1)}{\rho(u)}\right) \ne 0.$$

Then, the random variable $X_n$ with probability generating function

$$p_n(u) = \frac{[z^n]F(z,u)}{[z^n]F(z,1)}$$

after standardization, converges in distribution to a Gaussian variable, with a speed of convergence that is $O(n^{-1/2})$. The mean and the variance of $X_n$ are asymptotically linear in $n$.
Proof. First we offer a few comments. Given the analytic solution $\rho(u)$ of the implicit equation $C(\rho(u),u) = 0$, the PGF $\mathbb{E}(u^{X_n})$ satisfies a quasi-power approximation of the form $A(u)(\rho(1)/\rho(u))^n$, as we prove below. The mean $\mu_n$ and variance $\sigma_n^2$ are then of the form

$$\mu_n = \mathfrak{m}\!\left(\frac{\rho(1)}{\rho(u)}\right)n + O(1), \qquad \sigma_n^2 = \mathfrak{v}\!\left(\frac{\rho(1)}{\rho(u)}\right)n + O(1). \tag{37}$$

The variability condition of the Quasi-powers Theorem is precisely ensured by condition (iii). Set

$$c_{i,j} := \left.\frac{\partial^{i+j}}{\partial z^i\,\partial u^j}C(z,u)\right|_{(\rho,1)}.$$

The numerical coefficients in (37) can themselves be solely expressed in terms of partial derivatives of $C(z,u)$ by series reversion,

$$\rho(u) = \rho - \frac{c_{0,1}}{c_{1,0}}(u-1) - \frac{c_{1,0}^2c_{0,2} - 2c_{1,0}c_{1,1}c_{0,1} + c_{2,0}c_{0,1}^2}{2c_{1,0}^3}(u-1)^2 + O((u-1)^3). \tag{38}$$
In particular the fact that $\rho(u)$ is non-constant, analytic, and is a simple root corresponds to $c_{0,1}c_{1,0} \ne 0$ (by the analytic Implicit Function Theorem). The variance condition is then computed to be equivalent to the cubic inequality in the $c_{i,j}$:

$$\rho\,c_{1,0}^2c_{0,2} - 2\rho\,c_{1,0}c_{1,1}c_{0,1} + \rho\,c_{2,0}c_{0,1}^2 + c_{0,1}^2c_{1,0} + c_{0,1}c_{1,0}^2\,\rho \ne 0. \tag{39}$$
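The passage from the partial derivatives $c_{i,j}$ to the leading coefficients of the mean and variance is mechanical and can be sketched in a few lines. The code below is an illustrative sketch (the function name is hypothetical): it forms the first two coefficients of the local expansion of $\rho(u)$ by series reversion, then reads off the coefficients of $t$ and $t^2/2$ in $\log(\rho/\rho(e^t))$. The two checks use $C = 1 - z(1+u)$ (binomial case, slopes 1/2 and 1/4) and $C = 1 - zu$ (degenerate case, variance 0).

```python
from fractions import Fraction

def quasi_power_coefficients(rho, c10, c01, c20, c11, c02):
    """Leading coefficients of mean and variance from partial derivatives of C at (rho, 1)."""
    # Local expansion rho(u) = rho + a (u - 1) + b (u - 1)^2 + ... by series reversion.
    a = -Fraction(c01) / c10
    b = -Fraction(c10 ** 2 * c02 - 2 * c10 * c11 * c01 + c20 * c01 ** 2) / (2 * c10 ** 3)
    # With u = e^t:  log(rho / rho(e^t)) = mean_coeff * t + var_coeff * t^2 / 2 + ...
    mean_coeff = -a / rho
    var_coeff = -a / rho - 2 * b / rho + (a / rho) ** 2
    return mean_coeff, var_coeff

# C(z, u) = 1 - z (1 + u): rho = 1/2, number of u-marked letters in a binary word.
print(quasi_power_coefficients(Fraction(1, 2), -2, Fraction(-1, 2), 0, -1, 0))
# C(z, u) = 1 - z u: rho = 1, the parameter is deterministic, so the variance vanishes.
print(quasi_power_coefficients(Fraction(1), -1, -1, 0, -1, 0))
```

Multiplying `var_coeff` by $\rho^2 c_{1,0}^3$ gives a cubic expression in the $c_{i,j}$ whose non-vanishing is the variability condition.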


We can now proceed with asymptotic estimates. Fix a $u$-domain $|u-1| \le \delta$ such that $B$, $C$ are analytic. Then, one has

$$f_n(u) := [z^n]F(z,u) = \frac{1}{2i\pi}\oint F(z,u)\,\frac{dz}{z^{n+1}},$$

where the integral is taken along a small enough contour encircling the origin. We use the analysis of polar singularities described in Chapter IV, exactly as in (35). As $F(z,u)$ has at most one (simple) pole in $|z| \le r$, we have

$$f_n(u) = -\operatorname{Res}\left(\frac{B(z,u)}{C(z,u)}\,z^{-n-1}\right)_{z=\rho(u)} + \frac{1}{2i\pi}\int_{|z|=r}F(z,u)\,\frac{dz}{z^{n+1}}, \tag{40}$$

where we may assume $u$ suitably restricted by $|u-1| < \delta$ in such a way that $|r - \rho(u)| > \frac{1}{2}(r - \rho)$, so that the pole $\rho(u)$ stays away from the contour $|z| = r$.

The modulus of the second term in (40) is bounded from above by

$$\frac{K}{r^n} \qquad\text{where}\qquad K = \frac{\sup_{|z|=r,\,|u-1|\le\delta}|B(z,u)|}{\inf_{|z|=r,\,|u-1|\le\delta}|C(z,u)|}. \tag{41}$$

Since the domain $|z| = r$, $|u-1| \le \delta$ is closed (and bounded), $|C(z,u)|$ attains its minimum, which must be non-zero, given the unicity of the zero of $C$. At the same time, $B(z,u)$ being analytic, its modulus is bounded from above. Thus, the constant $K$ in (41) is finite. Trivial bounds applied to the integral of (40) then yield

$$f_n(u) = -\frac{B(\rho(u),u)}{C'_z(\rho(u),u)}\,\rho(u)^{-n-1} + O(r^{-n}),$$

uniformly for $u$ in a small enough fixed neighbourhood of 1. The mean and variance then satisfy (37), with the coefficient in the leading term of the variance being, by assumption, non-zero. Thus, the conditions of the Quasi-powers Theorem in the form (28), p. 645, are satisfied, and the law is Gaussian in the asymptotic limit.


Some form of condition, such as those in (ii) and (iii), is a necessity. For instance, the functions

$$\frac{1}{1-z}, \qquad \frac{1}{1-zu}, \qquad \frac{1}{1-zu^2}, \qquad \frac{1}{1-z^2u},$$

each fail to satisfy the non-degeneracy and the variability condition, the variance of the corresponding discrete distribution being identically 0. The variance is $O(1)$ for a related function such as

$$F(z,u) = \frac{1}{1 - z(u+2) + 2z^2u} = \frac{1}{(1-2z)(1-zu)},$$

which is excluded by the variability condition of the theorem; there, a discrete limit law (a geometric) is known to hold (p. 614). Yet another situation arises when considering

$$F(z,u) = \frac{1}{(1-z)(1-zu)}.$$

There is now a double pole at 1 when $u = 1$ that arises from “confluence” at $u = 1$ of two analytic branches $\rho_1(u) = 1$ and $\rho_2(u) = 1/u$. In this particular case, the limit law is continuous but non-Gaussian; in fact, this limit is the uniform distribution over the interval $[0,1]$, since

$$F(z,u) = 1 + z(1+u) + z^2(1+u+u^2) + z^3(1+u+u^2+u^3) + \cdots.$$

In addition, for this case, the mean is $O(n)$ but the variance is $O(n^2)$. Such situations are examined in Section IX.11, p. 703, at the end of this chapter.


▷ IX.28. Higher order poles. Under the conditions of Theorem IX.9, a limit Gaussian law holds for the distributions generated by the BGF $F(z,u)^m$. More generally, the statement extends to functions with an $m$th order pole. See [35]. ◁


The next four applications of Theorem IX.9 are relative to runs in permutations, 
patterns in words, the perimeter of parallelogram polyominoes, and finally the analysis 
of Euclid’s algorithm on polynomials. It is of interest to note that, for runs and pat- 
terns, the BGFs were each deduced in Chapter III by an inclusion—exclusion argument 
that involves sequences in an essential way. 


Example IX.12. Ascending runs in permutations and Eulerian numbers. The exponential BGF of Eulerian numbers (that count runs in permutations) is, by Example III.25, p. 209,

$$F(z,u) = \frac{u(1-u)}{e^{(u-1)z} - u},$$

where, for $u = 1$, we have $F(z,1) = (1-z)^{-1}$. The roots of the denominator are then

$$\rho_j(u) = \rho(u) + \frac{2ij\pi}{u-1}, \qquad\text{where}\quad \rho(u) = \frac{\log u}{u-1}, \tag{42}$$

and $j$ is an arbitrary element of $\mathbb{Z}$. As $u$ approaches 1, $\rho(u)$ is close to 1, whereas the other poles $\rho_j(u)$ with $j \ne 0$ escape to infinity. This fact is also consistent with the limit form $F(z,1) = (1-z)^{-1}$ which has only one (simple) pole at 1. If one restricts $u$ to $|u| \le 2$, there is clearly at most one root of the denominator in $|z| \le 2$, given by $\rho(u)$. Thus, we have for $u$ close enough to 1,

$$F(z,u) = \frac{1}{\rho(u) - z} + R(z,u),$$

with $z \mapsto R(z,u)$ analytic in $|z| \le 2$, and

$$[z^n]F(z,u) = \rho(u)^{-n-1} + O(2^{-n}).$$

The variability conditions are satisfied since

$$\rho(u) = \frac{\log u}{u-1} = 1 - \frac{1}{2}(u-1) + \frac{1}{3}(u-1)^2 + \cdots,$$

so that $\mathfrak{v}(1/\rho(u)) = \frac{1}{12}$ is non-zero.

Proposition IX.9. The Eulerian distribution is, after standardization, asymptotically Gaussian, with mean and variance given by $\mu_n = (n+1)/2$, $\sigma_n^2 = (n+1)/12$. The speed of convergence is $O(n^{-1/2})$.
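The moments in Proposition IX.9 can be confirmed exactly for small sizes. The sketch below is an illustration, not the method of the text: it uses the classical insertion recurrence for Eulerian numbers, $A(m,k) = k\,A(m-1,k) + (m-k+1)\,A(m-1,k-1)$ with $k$ the number of ascending runs, rather than the BGF. Function names are hypothetical.

```python
from fractions import Fraction

def run_distribution(n):
    """counts[k] = number of permutations of size n with exactly k ascending runs."""
    counts = [0, 1]  # size 1: a single run
    for m in range(2, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            # Place m at the end of one of the k runs: the number of runs is unchanged.
            keep = k * counts[k] if k < m else 0
            # Place m in any of the other m - k + 1 positions: one more run appears.
            grow = (m - k + 1) * counts[k - 1]
            new[k] = keep + grow
        counts = new
    return counts

def run_moments(n):
    counts = run_distribution(n)
    total = sum(counts)  # equals n!
    mean = Fraction(sum(k * c for k, c in enumerate(counts)), total)
    second = Fraction(sum(k * k * c for k, c in enumerate(counts)), total)
    return mean, second - mean ** 2

print(run_distribution(3))   # [0, 1, 4, 1]
print(run_moments(10))       # (Fraction(11, 2), Fraction(11, 12))
```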
Figure IX.11. The diagram of poles of the BGF $z \mapsto F(z,u)$ associated to the pattern abaa with correlation polynomial $c(z) = 1 + z^3$, when $u$ varies on the unit circle. The denominator is of degree 4 in $z$: one branch, $\rho(u)$, clusters near the dominant singularity $\rho = 1/2$ of $F(z,1)$, whereas three other singularities stay away from the disc $|z| \le 1/2$ and escape to infinity as $u \to 1$.


This example is a famous one (see also our Invitation, p. 9) and our derivation follows Bender’s paper [35]. The Gaussian character of the distribution has been known for a long time; it is for instance to be found in David and Barton’s Combinatorial Chance [139] published in 1962. There are in this case interesting connections with elementary probability theory: if $U_j$ are independent random variables that are uniformly distributed over the interval $[0,1]$, then one has

$$[z^nu^k]F(z,u) = \mathbb{P}\{\lfloor U_1 + \cdots + U_n\rfloor < k\}.$$

Because of this fact, the normal limit is thus often derived as a consequence of the Central Limit Theorem, after one takes care of unimportant details relative to the integer part $\lfloor\cdot\rfloor$ function; see [139, 524].


Example IX.13. Patterns in strings. Consider the class $\mathcal{F}$ of binary strings (the “texts”), and fix a “pattern” $w$ of length $k$. Let $\chi$ be the number of (possibly overlapping) occurrences of $w$. (The pattern $w$ occurs if it is a factor, i.e., if its letters occur contiguously in the text.) Let $F(z,u)$ be the BGF relative to the pair $(\mathcal{F},\chi)$. The Guibas–Odlyzko correlation polynomial¹⁰ relative to $w$ is denoted by $c(z) \equiv c_w(z)$. We know, from Chapter I, that the OGF of words with pattern $w$ excluded is

$$F(z,0) = \frac{c(z)}{z^k + (1-2z)c(z)}.$$

By the inclusion–exclusion argument of Chapter III (p. 212), the BGF is

$$F(z,u) = \frac{1 - (c(z)-1)(u-1)}{1 - 2z - (u-1)\left(z^k + (1-2z)(c(z)-1)\right)}.$$

¹⁰ The correlation polynomial, as defined in Chapter I (p. 60), has coefficients in $\{0,1\}$, with $[z^j]c(z) = 1$ iff $w$ matches its image shifted to the right by $j$ positions.
Let $D(z,u)$ be the denominator. Then $D(z,u)$ depends analytically on $z$, for $u$ near 1 and $z$ near 1/2. In addition, the partial derivative $D'_z(1/2,1)$ is non-zero. Thus, $\rho(u)$ is analytic at $u = 1$, with $\rho(1) = 1/2$ (see Figure IX.11). The local expansion of the root $\rho(u)$ of $D(\rho(u),u) = 0$ follows from local series reversion,

$$2\rho(u) = 1 - 2^{-k}(u-1) + \left(k\,2^{-2k} - 2^{-k}\left(c(1/2) - 1\right)\right)(u-1)^2 + O\!\left((u-1)^3\right).$$

Theorem IX.9 applies.

Proposition IX.10. The number of occurrences of a fixed pattern in a large string is, after standardization, asymptotically normal. The mean $\mu_n$ and variance $\sigma_n^2$ satisfy

$$\mu_n = \frac{n}{2^k} + O(1), \qquad \sigma_n^2 = \left(2^{-k}\left(2c(1/2) - 1\right) + 2^{-2k}(1-2k)\right)n + O(1),$$

and the speed of convergence to the Gaussian limit is $O(n^{-1/2})$.

(The mean does not depend on the order of letters in the pattern, only the variance does.) Proposition IX.10 has been derived independently by many authors and it has been generalized in many ways, see for instance [43, 455, 506, 564, 603] and references therein.
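The slopes of the mean and variance in Proposition IX.10 lend themselves to a brute-force check on short texts. The sketch below is an illustration, not the analytic method: it enumerates all binary texts of length $n$, and it assumes the correlation polynomial is normalized with constant term 1 (as in $c(z) = 1 + z^3$ for abaa), so that the variance slope is $2^{-k}(2c(1/2)-1) + 2^{-2k}(1-2k)$. Since occurrence indicators at distance $k$ or more are independent, the exact variance is linear in $n$ once $n \ge 2k-2$, and successive differences reproduce the slope exactly. Function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def occurrences(text, pattern):
    """Number of (possibly overlapping) occurrences of pattern as a factor of text."""
    k = len(pattern)
    return sum(text[i:i + k] == pattern for i in range(len(text) - k + 1))

def occurrence_moments(pattern, n):
    """Exact mean and variance over all 2^n binary texts of length n (brute force)."""
    counts = [occurrences("".join(t), pattern) for t in product("ab", repeat=n)]
    mean = Fraction(sum(counts), len(counts))
    second = Fraction(sum(c * c for c in counts), len(counts))
    return mean, second - mean ** 2

def correlation_at_half(pattern):
    """c(1/2), where [z^j]c(z) = 1 iff the pattern matches itself shifted by j positions."""
    k = len(pattern)
    return sum(Fraction(1, 2 ** j) for j in range(k) if pattern[j:] == pattern[:k - j])

def predicted_slopes(pattern):
    k = len(pattern)
    c = correlation_at_half(pattern)
    mean_slope = Fraction(1, 2 ** k)
    var_slope = Fraction(1, 2 ** k) * (2 * c - 1) + Fraction(1, 4 ** k) * (1 - 2 * k)
    return mean_slope, var_slope

m10, v10 = occurrence_moments("abaa", 10)
m11, v11 = occurrence_moments("abaa", 11)
print(m11 - m10, v11 - v10, predicted_slopes("abaa"))
```

For the one-letter pattern "a" the prediction reduces to slopes 1/2 and 1/4, the binomial case.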


▷ IX.29. Patterns in Bernoulli texts. Asymptotic normality also holds when letters in strings are chosen independently but with an arbitrary probability distribution. It suffices to use the weighted correlation polynomial described in Note III.39, p. 213. ◁


Example IX.14. Parallelogram polyominoes. Polyominoes are plane diagrams that are closely related to models of statistical physics, while having been the subject of a vast combinatorial literature. This example has the merit of illustrating a level of difficulty somewhat higher than in previous examples and typical of many “real-life” applications. Our presentation follows an early article of Bender [38] and a more recent paper of Louchard [419]. We consider here the variety of polyominoes called parallelograms. A parallelogram is a sequence of segments,

$$[a_1,b_1],\ [a_2,b_2],\ \ldots,\ [a_m,b_m], \qquad a_1 \le a_2 \le \cdots \le a_m, \quad b_1 \le b_2 \le \cdots \le b_m,$$

where the $a_j$ and $b_j$ are integers with $b_j - a_j \ge 1$, and one takes $a_1 = 0$ for definiteness. A parallelogram can thus be viewed as a stack of segments (with $[a_{j+1},b_{j+1}]$ placed on top of $[a_j,b_j]$) that leans smoothly to the right:

[Figure: an instance of a parallelogram polyomino.]

The quantity $m$ is called the height, the quantity $b_m - a_1$ the width, their sum is called the (semi)perimeter, and the grand total $\sum_j (b_j - a_j)$ is called the area. (This instance has area 39, width 13, height 9, and perimeter 13 + 9 = 22.) We examine here parallelograms of fixed area and investigate the distribution of perimeter.
The ordinary BGF of parallelograms, with $z$ marking area and $u$ marking perimeter is¹¹, as we shall prove momentarily,

$$F(z,u) = u\,\frac{J_1(z,u)}{J_0(z,u)}, \tag{43}$$

where $J_0$, $J_1$ belong to the realm of “$q$-analogues” and generalize the classical Bessel functions,

$$J_0(q,u) := \sum_{n\ge 0}\frac{(-1)^n u^n q^{n(n+1)/2}}{(q;q)_n\,(uq;q)_n}, \qquad J_1(q,u) := \sum_{n\ge 1}\frac{(-1)^{n-1} u^n q^{n(n+1)/2}}{(q;q)_{n-1}\,(uq;q)_n},$$

with the “$q$-factorial” notation being used:

$$(a;q)_n = (1-a)(1-aq)\cdots(1-aq^{n-1}).$$

Combinatorially, the BGF stated by (43) is obtained in a way that is reminiscent of Example III.22, p. 199. Its expression results from a simple construction: a parallelogram is either an interval, or it is derived from an existing parallelogram by stacking on top a new interval. Let $G(w) \equiv G(x,y,z,w)$ be the OGF with $x$, $y$, $z$, $w$ marking width, height, area, and length of top segment, respectively. The GF of a parallelogram made of a single non-zero interval is

$$a(w) \equiv a(x,y,z,w) = \frac{xyzw}{1-xzw}.$$

The operation of piling up a new segment on top of a segment of length $m$ that is represented by a term $w^m$ is described by

$$w^m \;\mapsto\; \frac{y}{1-xzw}\left(z^mw^m + \cdots + zw\right) = yzw\,\frac{1 - z^mw^m}{(1-zw)(1-xzw)}.$$

Thus, $G$ satisfies the functional equation,

$$G(w) = \frac{xyzw}{1-xzw} + \frac{yzw}{(1-zw)(1-xzw)}\left[G(1) - G(zw)\right]. \tag{44}$$

This is the method of “adding a slice” introduced in Chapter III, p. 199, which is reflected by the relation (44). Now, an equation of the form,

$$G(w) = a(w) + b(w)\left[G(1) - G(\lambda w)\right],$$

is solved by iteration:

$$\begin{aligned} G(w) &= a(w) + b(w)G(1) - b(w)G(\lambda w)\\ &= \left(a(w) - b(w)a(\lambda w) + b(w)b(\lambda w)a(\lambda^2 w) - \cdots\right)\\ &\quad + G(1)\left(b(w) - b(w)b(\lambda w) + b(w)b(\lambda w)b(\lambda^2 w) - \cdots\right). \end{aligned}$$

One then isolates $G(1)$ by setting $w = 1$. This expresses $G(1)$ as the quotient of two similar looking series (formed with sums of products of $b$ values). Here, with $\lambda = z$, this gives $G(x,y,z,1)$, from which the form (43) of $F(z,u)$ derives, since $F(z,u) = G(u,u,z,1)$.


Analytically, one should first estimate $[z^n]F(z,1)$, the number of parallelograms of size (i.e., area) equal to $n$. We have $F(z,1) = J_1(z,1)/J_0(z,1)$, where the denominator is

$$J_0(z,1) = 1 - \frac{z}{(1-z)^2} + \frac{z^3}{(1-z)^2(1-z^2)^2} - \frac{z^6}{(1-z)^2(1-z^2)^2(1-z^3)^2} + \cdots.$$

¹¹ Thus, $F(z,1) = z + 2z^2 + 4z^3 + 9z^4 + 20z^5 + 46z^6 + \cdots$, corresponding to EIS A006958 (“staircase polyominoes”).
Clearly, $J_0(z,1)$ and $J_1(z,1)$ are analytic in $|z| < 1$, and it is not hard to see that $J_0(z,1)$ decreases from 1 to about $-0.24$ when $z$ varies between 0 and 1/2, with a root at

$$\rho \doteq 0.43306\,19231\,29252,$$

where the derivative satisfies $J_0'(\rho,1) \doteq -3.76 \ne 0$, so that the zero is simple¹². Since $F(z,1)$ is by construction meromorphic in the unit disc and $J_1(\rho,1) \doteq 0.48 \ne 0$, the number of parallelograms satisfies

$$[z^n]F(z,1) = \alpha_1\,\alpha_2^n + O\!\left(\left(\frac{10}{7}\right)^n\right),$$

where

$$\alpha_1 \doteq 0.29745\,35058\,07786, \qquad \alpha_2 = \frac{1}{\rho} \doteq 2.30913\,85933\,31230.$$

As is common in meromorphic analyses, the approximation of coefficients is quite good; for instance, the relative error is only about $10^{-8}$ for $n = 35$.
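The univariate counts can be recomputed directly from the quotient $J_1(z,1)/J_0(z,1)$. The following sketch is an illustration with hypothetical function names: it expands both series as truncated power series with exact integer coefficients, divides them, and compares the ratio of consecutive coefficients with the growth rate $1/\rho$ quoted in the text.

```python
def divide_by_one_minus_power(series, j):
    """Multiply a truncated power series (list of ints) by 1/(1 - z^j), in place."""
    for i in range(j, len(series)):
        series[i] += series[i - j]

def parallelogram_counts(n_max):
    """Coefficients [z^n] J_1(z,1)/J_0(z,1) for n = 0..n_max, by exact series arithmetic."""
    size = n_max + 1
    j0 = [0] * size
    j1 = [0] * size
    n = 0
    while n * (n + 1) // 2 <= n_max:
        sign = -1 if n % 2 else 1
        # Term of J_0: (-1)^n z^{n(n+1)/2} / ((z;z)_n)^2.
        term = [0] * size
        term[n * (n + 1) // 2] = 1
        for j in range(1, n + 1):
            divide_by_one_minus_power(term, j)
            divide_by_one_minus_power(term, j)
        for i in range(size):
            j0[i] += sign * term[i]
        if n >= 1:
            # Term of J_1: (-1)^(n-1) z^{n(n+1)/2} / ((z;z)_{n-1} (z;z)_n).
            term = [0] * size
            term[n * (n + 1) // 2] = 1
            for j in range(1, n + 1):
                divide_by_one_minus_power(term, j)
            for j in range(1, n):
                divide_by_one_minus_power(term, j)
            for i in range(size):
                j1[i] -= sign * term[i]
        n += 1
    # Series division F = J_1 / J_0, using J_0(0) = 1.
    f = [0] * size
    for i in range(size):
        f[i] = j1[i] - sum(j0[k] * f[i - k] for k in range(1, i + 1))
    return f

counts = parallelogram_counts(41)
print(counts[:7])                  # [0, 1, 2, 4, 9, 20, 46]
print(counts[41] / counts[40])     # close to 1/rho = 2.30913859...
```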

We are now ready for bivariate asymptotics. Take $|z| \le r = 7/10$ and $|u| \le 11/10$. Because of the form of their general terms that involve $z^{n(n+1)/2}u^n$ in the numerators while the denominators stay bounded away from 0, the functions $J_0(z,u)$ and $J_1(z,u)$ remain analytic there. Thus, $\rho(u)$ exists and is analytic for $u$ in a sufficiently small neighbourhood of 1 (by Weierstrass preparation or implicit functions). The non-degeneracy conditions are easily verified by numerical computations. There results that Theorem IX.9 applies.

Proposition IX.11. The perimeter of a random parallelogram polyomino of area $n$ admits a limit law that is Gaussian with mean and variance that satisfy $\mu_n \sim \mu n$, $\sigma_n \sim \sigma\sqrt{n}$, with

$$\mu \doteq 0.84176\,20156, \qquad \sigma \doteq 0.42420\,65326.$$

This indicates that a random parallelogram is most likely to resemble a slanted stack of fairly short segments.


▷ IX.30. Width and height of parallelogram polyominoes are normal. Similar perturbation methods show that the expected height and width are each $O(n)$ on average, again with Gaussian limit laws. ◁


▷ IX.31. The base of a coin fountain. A coin fountain (Example V.9, p. 330) is defined as a vector $v = (v_0, v_1, \ldots, v_\ell)$, such that $v_0 = 0$, $v_j \ge 0$ is an integer, $v_\ell = 0$ and $|v_{j+1} - v_j| = 1$. Take as size the area, $n = \sum_j v_j$. Then the distribution of the base length $\ell$ in a random coin fountain of size $n$ is asymptotically normal. (This amounts to considering all ruin sequences of a fixed area as equally likely, and regarding the number of steps in the game as a random variable.) Similarly the number of “arches” is asymptotically Gaussian. ◁


Example IX.15. Euclid’s GCD Algorithm over polynomials. We revisit the class $\mathcal{P} \subset \mathbb{F}_p[X]$ of monic polynomials in a variable $X$ and coefficients in a prime field $\mathbb{F}_p$ (Example I.20, p. 90). Size of a polynomial is identified with degree. Euclidean division applies to any pair of polynomials $(u,v)$, with $v \ne 0$: it provides a quotient ($q$) and a remainder ($r$), such that

$$u = vq + r, \qquad\text{with } r = 0 \text{ or } \deg(r) < \deg v.$$

¹² As usual, such computations can be easily validated by carefully controlled numerical evaluations coupled with Rouché’s theorem (see Chapter IV, p. 263).
Euclid’s Greatest Common Divisor (GCD) Algorithm applies to any pair of polynomials $(u_1,u_0)$ satisfying $\deg(u_1) < \deg(u_0)$, proceeding by successive divisions [379]:

$$\begin{aligned} u_0 &= q_1u_1 + u_2\\ u_1 &= q_2u_2 + u_3\\ &\;\;\vdots\\ u_{h-2} &= q_{h-1}u_{h-1} + u_h\\ u_{h-1} &= q_hu_h + 0. \end{aligned} \tag{45}$$

The number $h$ is the number of steps of the algorithm. (It also corresponds to the height of the continued fraction representation of $u_1/u_0$: write $u_1/u_0 = 1/(q_1 + 1/\cdots)$.) The quotient polynomials $q_j$, for $1 \le j \le h$, are each of degree at least 1 and one can always normalize things so that the $u_j$ are monic. The last polynomial $u_h$ is the gcd of the pair $(u_1,u_0)$. (By convention, $\deg(0) = -\infty$, the gcd of $(0,u_0)$ is 1 and its height is 0.)

Together with the class $\mathcal{P}$, we introduce the class $\mathcal{G}$ of “general” (non-necessarily monic) polynomials and the subclass $\mathcal{G}^+$ of those of degree at least 1. The class $\mathcal{F}$ of fractions consists of all the pairs $(u_1,u_0)$ such that: (i) the polynomial $u_0$ is monic; (ii) either $u_1 = 0$ or $\deg(u_1) < \deg(u_0)$. (View the pair as representing $u_1/u_0$.) The size of a fraction is by definition the degree of $u_0$. The corresponding OGFs are instantly found to be:

$$P(z) = \frac{1}{1-pz}, \qquad G^+(z) = \frac{p(p-1)z}{1-pz}, \qquad F(z) = \frac{1}{1-p^2z}. \tag{46}$$
The simple but startling fact that renders the analysis easy is the following: Euclid’s algorithm yields a combinatorial isomorphism between $\mathcal{F}$-fractions and pairs composed of a sequence of $\mathcal{G}^+$ polynomials (the quotients) and a $\mathcal{P}$-polynomial (the gcd). In symbols:

$$\mathcal{F} \cong \mathrm{SEQ}(\mathcal{G}^+) \times \mathcal{P}. \tag{47}$$

A direct consequence of (47) is the BGF of $\mathcal{F}$, with $u$ marking the number of steps:

$$F(z,u) = \frac{1}{1 - uG^+(z)}\cdot\frac{1}{1-pz} = \frac{1}{1 - u\,\dfrac{p(p-1)z}{1-pz}}\cdot\frac{1}{1-pz}. \tag{48}$$

Similarly, with $u$ marking the number of quotients of some fixed degree $k$, one obtains the BGF

$$F(z,u) = \frac{1}{1 - \dfrac{p(p-1)z}{1-pz} - p^k(p-1)z^k(u-1)}\cdot\frac{1}{1-pz}. \tag{49}$$


Both cases give rise to direct applications of Theorem IX.9, p. 656, relative to the meromorphic schema. A simple computation then gives:

Proposition IX.12. When applied to a random polynomial fraction of degree $n$, the number of steps of Euclid’s algorithm is asymptotically normal with mean

$$\mathbb{E}(\#\,\text{steps}) = \frac{p-1}{p}\,n + O(1),$$

and variance $O(n)$. The number of quotients of a fixed degree $k$ is also asymptotically Gaussian, with mean $\sim c_kn$ and variance $O(n)$, where $c_k = p^{-k-1}(p-1)^2$.

Similar considerations and the methods of Section IX.2 show that the degree of the gcd itself is asymptotically geometric, with rate $p^{-1}$. Original analyses are due to Knopfmacher–Knopfmacher [371] and Friesen–Hensley [270]. In such a case, the transparent character of the analytic-combinatorial proofs is worthy of note.
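The mean number of steps can be checked by exhaustive enumeration over a small field. The sketch below is an illustration with hypothetical helper names: polynomials are lists of coefficients (lowest degree first), the algorithm is run on every fraction of size $n$, and the average is compared with $n(p-1)/p$. Clearing the inner denominator in the BGF with $u$ marking steps gives $1/(1 - pz - up(p-1)z)$, so that the coefficient of $z^n$ is $p^n(1+(p-1)u)^n$ and the mean is exactly $n(p-1)/p$ in this model. The modular inverse `pow(x, -1, p)` assumes Python 3.8 or later.

```python
from fractions import Fraction
from itertools import product

def trim(poly):
    """Drop leading zero coefficients (lists of coefficients, lowest degree first)."""
    while poly and poly[-1] == 0:
        poly.pop()
    return poly

def remainder(u, v, p):
    """Remainder of u divided by v over F_p (v non-zero and trimmed, p prime)."""
    u = trim([c % p for c in u])
    inv_lead = pow(v[-1], -1, p)   # inverse of the leading coefficient of v
    while len(u) >= len(v):
        factor = u[-1] * inv_lead % p
        shift = len(u) - len(v)
        for i, c in enumerate(v):
            u[shift + i] = (u[shift + i] - factor * c) % p
        trim(u)
    return u

def euclid_steps(u1, u0, p):
    """Number of divisions performed by Euclid's algorithm on (u1, u0), deg u1 < deg u0."""
    steps = 0
    a, b = list(u0), trim(list(u1))
    while b:
        a, b = b, remainder(a, b, p)
        steps += 1
    return steps

def mean_steps(p, n):
    """Average number of steps over all fractions of size n (u0 monic of degree n)."""
    total = count = 0
    for low in product(range(p), repeat=n):
        u0 = list(low) + [1]
        for u1 in product(range(p), repeat=n):   # all polynomials of degree < n, including 0
            total += euclid_steps(list(u1), u0, p)
            count += 1
    return Fraction(total, count)

print(mean_steps(2, 3), Fraction(3 * (2 - 1), 2))
print(mean_steps(3, 2), Fraction(2 * (3 - 1), 3))
```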
▷ IX.32. Euclid’s integer-gcd algorithm is Gaussian. This spectacular and deep result is originally due to Hensley [331], with important improvements brought by Baladi–Vallée [25]. The reference set is now the pair of integers in $[1\,..\,n]$, to which Euclid’s algorithm is applied. The number of steps has expectation

$$\frac{12\log 2}{\pi^2}\log n + o(\log n),$$

as first established by Dixon [166] and Heilbronn [327]; see Knuth’s book [379, pp. 356–373] for a good story. The proof of the Gaussian limit, following [25, 331], makes use of the transfer operator $\mathcal{G}_s$ associated with the transformation $x \mapsto \{1/x\} \equiv 1/x - \lfloor 1/x\rfloor$; namely,

$$\mathcal{G}_s[f](x) = \sum_{n=1}^{\infty}\frac{1}{(n+x)^s}\,f\!\left(\frac{1}{n+x}\right).$$

It is then proved that a bivariate Dirichlet series describing the number of steps of Euclid’s algorithm can be expressed in terms of the quasi-inverse $(I - u\mathcal{G}_s)^{-1}$; compare with (48). Perturbation theory of the dominant eigenvalue $\lambda_1(s)$ of $\mathcal{G}_s$ in conjunction with the Mellin–Perron formula, an adapted form of singularity analysis, and the Quasi-powers Theorem (and hard work, as well) eventually yield the result. An operator analogue of (49) also holds, from which the frequency of quotient values can be quantified: the asymptotic frequency of $k$ is $\log_2\!\left(1 + 1/(k(k+2))\right)$, a quantity that sums to 1 over $k \ge 1$ by telescoping. See Vallée’s surveys [583, 584], Hensley’s book [332], and references therein for a review of these methods and many other applications. ◁


Perturbation of linear systems. There is usually a fairly transparent approach to the analysis of BGFs defined implicitly as solutions of functional equations. One should start with the analysis at $u = 1$ and then examine the effect on singularities when $u$ varies in a very small neighbourhood of 1. In accordance with what we have already seen many times, the process involves a perturbation analysis of the solution to a functional equation near a singularity, here one that moves.

We consider here functions defined implicitly by a linear system of positive equations, nonlinear systems being discussed in the next section. Positive linear systems arise in connection with problems specified by finite state devices, paths in graphs, finite Markov chains, and transfer matrix models (Sections V.5, p. 336 and V.6, p. 356). The bivariate problem is then expressed by a linear equation

$$Y(z,u) = V(z,u) + T(z,u)\cdot Y(z,u), \tag{50}$$

where $T(z,u)$ is an $m \times m$ matrix with entries that are polynomial in $z$, $u$ with non-negative coefficients, $Y(z,u)$ is an $m \times 1$ column vector of unknowns, and $V(z,u)$ is a column vector of non-negative initial conditions.

Regarding the univariate problem,

$$Y(z) = V(z) + T(z)\cdot Y(z), \tag{51}$$

where $Y(z) = Y(z,1)$ and so on, we place ourselves under the assumptions of Corollary V.1, p. 358. This means that properness, positivity, irreducibility, and aperiodicity are assumed throughout. In this case (see the developments of Chapter V), Perron–Frobenius theory applies to the univariate matrix $T(z)$. In other words, the function

$$C(z) = \det(I - T(z))$$

has a unique dominant root $\rho > 0$ that is a simple zero. Accordingly, any component $F(z) = Y_j(z)$ of a solution to the system (51) has a unique dominant singularity at $z = \rho$ that is a simple pole,

$$F(z) = \frac{B(z)}{C(z)},$$

with $B(\rho) \ne 0$.
In the bivariate case, each component of the solution to the system (50) can be put under the form

$$F(z,u) = \frac{B(z,u)}{C(z,u)}, \qquad C(z,u) = \det(I - T(z,u)).$$

Since $B(z,u)$ is a polynomial with $B(\rho,1) \ne 0$, it does not vanish for $(z,u)$ in a sufficiently small neighbourhood of $(\rho,1)$. Similarly, by the analytic Implicit Function Theorem, there exists a function $\rho(u)$ locally analytic near $u = 1$, such that

$$C(\rho(u),u) = 0, \qquad \rho(1) = \rho.$$

Thus, it is sufficient that the variability condition (39) be satisfied in order to infer a limit Gaussian distribution.


Theorem IX.10 (Positive rational systems). Let $F(z,u)$ be a bivariate function that is analytic at $(0,0)$ and has non-negative coefficients. Assume that $F(z,u)$ coincides with the component $Y_1$ of a system of linear equations in $Y = (Y_1,\ldots,Y_m)^T$,

$$Y = V + T\cdot Y,$$

where $V = (V_1(z,u),\ldots,V_m(z,u))$, $T = (T_{i,j}(z,u))_{i,j=1}^{m}$, and each of $V_j$, $T_{i,j}$ is a polynomial in $z$, $u$ with non-negative coefficients. Assume also that $T(z,1)$ is transitive, proper, and primitive, and let $\rho(u)$ be the unique solution of

$$\det(I - T(\rho(u),u)) = 0,$$

assumed to be analytic at 1, such that $\rho(1) = \rho$. Then, provided the variability condition,

$$\mathfrak{v}\!\left(\frac{\rho(1)}{\rho(u)}\right) > 0,$$

is satisfied, a Gaussian Limit Law holds for the coefficients of $F(z,u)$ with mean and variance that are $O(n)$ and speed of convergence that is $O(n^{-1/2})$.


Example IX.16. Tilings. (This prolongs the enumerative discussion of Example V.18, p. 360.) Take a $(2 \times n)$ chessboard of 2 rows and $n$ columns, and consider coverings with “monomer tiles” that are $(1 \times 1)$-pieces, and “dimer tiles” that are either of the horizontal $(1 \times 2)$ or vertical $(2 \times 1)$ type. The parameter of interest is the (random) number of tiles. Consider next the collection of all “partial coverings” in which each column is covered exactly, except possibly for the last one. The partial coverings are of one of four types and the legal transitions are described by a compatibility graph. For instance, if the previous column started with one horizontal dimer and contained one monomer, the current column has one occupied cell, and one free cell that may then be occupied either by a monomer or a dimer. This finite state description corresponds to a set of linear equations over BGFs (with $z$ marking the area covered and $u$ marking the total number of tiles), with the transition matrix found to be
2 2 2 


u u u 
7 sr (a - 50 
ES Ee bb 8 
u 0 0 0 


In particular, we have 


det — T(z, u)) =1—zu- 2° (U2 + ie) 


Then, Theorem IX.10 applies: the number of tiles is asymptotically normal. The method clearly extends to $(k \times n)$ chessboards, for any fixed $k$ (see Bender et al. [35, 46]).


Example IX.17. Limit theorem for Markov chains. Assume that $M$ is the transition matrix of an irreducible aperiodic Markov chain, and consider the parameter $\chi$ that records the number of passages through state 1 in a path of length $n$ that starts in state 1. Then, Theorem IX.10 applies with

$$V = (1,0,\ldots,0)^T, \qquad T_{i,j}(z,u) = zM_{i,j} + z(u-1)M_{i,1}\delta_{j,1}.$$

We therefore derive a classical limit theorem for Markov chains:

Proposition IX.13. In an irreducible and aperiodic (finite) Markov chain, the number of times that a designated state is reached when $n$ transitions are effected is asymptotically Gaussian.

The conclusion also applies to paths in any strongly connected aperiodic digraph as well as to paths conditioned by their source and/or destination.
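The Markov chain statement can be illustrated by tracking the exact joint distribution of the current state and the number of entries into the designated state. The sketch below is an illustration under assumed data: the two-state matrix is hypothetical, chosen with stationary probability 1/3 for the designated state and second eigenvalue 1/4, so that the mean after $n$ transitions is $n/3 + \frac{2}{9}(1 - 4^{-n})$. Each step multiplies by the transition probabilities and shifts the count whenever the designated state is entered, which mirrors marking those transitions by $u$.

```python
from fractions import Fraction

def visit_moments(M, n, target=0):
    """Mean and variance of the number of entries into `target` in n transitions from `target`."""
    m = len(M)
    # dist[state][k] = probability of being in `state` after entering `target` k times.
    dist = [[Fraction(0)] * (n + 1) for _ in range(m)]
    dist[target][0] = Fraction(1)
    for _ in range(n):
        new = [[Fraction(0)] * (n + 1) for _ in range(m)]
        for i in range(m):
            for k in range(n + 1):
                prob = dist[i][k]
                if prob == 0:
                    continue
                for j in range(m):
                    if M[i][j] != 0:
                        # A transition into `target` carries one more mark.
                        new[j][k + 1 if j == target else k] += prob * M[i][j]
        dist = new
    pgf = [sum(dist[i][k] for i in range(m)) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pgf))
    second = sum(k * k * q for k, q in enumerate(pgf))
    return mean, second - mean ** 2

# Hypothetical chain: stationary distribution (1/3, 2/3), second eigenvalue 1/4.
M = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
mean, var = visit_moments(M, 60)
print(float(mean), 60 / 3 + 2 / 9)   # mean is n/3 plus a bounded correction
print(float(var) / 60)               # variance grows linearly in n
```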


▷ IX.33. Sets of patterns in words. This note extends Example IX.13 (p. 659) relative to the occurrence of a single pattern in a random text. Given the class $\mathcal{W} = \mathrm{SEQ}(\mathcal{A})$ of words over a finite alphabet $\mathcal{A}$, fix a finite set of “patterns” $S \subset \mathcal{W}$ and define $\chi(w)$ as the total number of occurrences of members of $S$ in the word $w \in \mathcal{W}$. It is possible to build a finite automaton (essentially a digital tree built on $S$ equipped with return edges) that records simultaneously the number of partial occurrences of each pattern. Then, the limit law of $\chi$ is Gaussian; see Bender and Kochman’s paper [43], the papers [240, 263] for an approach based on the de Bruijn graph, [30, 457] for an inclusion–exclusion treatment, and [564] for a perspective. ◁


▷ IX.34. Constrained integer compositions. Consider integer compositions where consecutive summands add up to at least 4. The number of summands in such a composition is asymptotically normal [46]. Similarly for a Carlitz composition (p. 201). ◁


▷ IX.35. Height in trees of bounded width. Consider general Catalan trees of width less than a fixed bound $w$. (The width is the maximum number of nodes at any level in the tree.) In such trees, the distribution of height is asymptotically Gaussian. ◁


IX.7. Perturbation of singularity analysis asymptotics 


In this central section, we examine analytic-combinatorial schemas that arise 
when generating functions contain algebraic—logarithmic singularities. The under- 
lying machinery is the method of singularity analysis detailed in Chapters VI and VII, 
on which suitable perturbative developments are grafted. 

An especially important feature of the method of singularity analysis, stemming from properties of Hankel contours, is the fact that it preserves uniformity of expansions¹³. This feature is crucial in analysing bivariate generating functions, where we need to estimate uniformly a coefficient $f_n(u) = [z^n]F(z,u)$ that depends on the parameter $u$, given some (uniform) knowledge on the singular structure of $F(z,u)$, as a function of $z$. It is from such estimates that limit Gaussian laws can typically be derived via quasi-power approximations and the Quasi-powers Theorem (Theorem IX.8, p. 645).

¹³ For instance, Darboux’s method discussed in Section VI.11, p. 433, only provides non-effective error terms, since it is based on the Riemann–Lebesgue lemma, so that it cannot be conveniently employed for bivariate asymptotics. A similar comment applies to Tauberian theorems.

In this section, we shall encounter two different types of situations, depending on 
the way the deformation induced by the secondary parameter affects the singularity 
of the function z +» F(z,u), when u is near 1. In accordance with the preliminary 
discussion of singularity perturbation and Gaussian laws, on p. 648, regarding the PGF 
Pn(t) := fn(u)/fnC), there is a fundamental dichotomy, depending on whether it is 
the singular exponent that varies or the dominant singularity that moves. 


— Variable exponent. This corresponds to the case where the dominant singularity
of $z \mapsto F(z,u)$ remains a constant $\rho$, but the singular exponent $\alpha(u)$
in the approximation $F(z,u) \approx (1 - z/\rho)^{-\alpha(u)}$ varies smoothly, to the effect
that $p_n(u) \approx n^{\alpha(u)-\alpha(1)}$. We then have a Gaussian limit law in the scale of
$\log n$ for the mean and the variance.

— Movable singularity. This is the case where the singular exponent retains
a constant value $\alpha$, but the dominant singularity $\rho(u)$ in the approximation
$F(z,u) \approx (1 - z/\rho(u))^{-\alpha}$ moves smoothly with $u$, to the effect that
$p_n(u) \sim (\rho(1)/\rho(u))^n$. There is again a Gaussian limit law, but a mean and variance
that are now of the order of $n$.
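
The orders of growth quoted in the two cases can be read off from a short heuristic computation. The sketch below is not part of the original argument: it assumes that a quasi-power approximation $p_n(u) \approx B(u)^{\beta_n}$, with $B(1) = 1$, may be differentiated at $u = 1$ as if it were exact, which is what the Quasi-powers Theorem justifies under its own hypotheses. The symbols $B(u)$ and $\beta_n$ are notations introduced here for illustration only.

```latex
% Heuristic: p_n(u) \approx B(u)^{\beta_n}, with B(1) = 1.
\begin{align*}
\mu_n &\approx \beta_n\, B'(1), &
\sigma_n^2 &\approx \beta_n \bigl(B''(1) + B'(1) - B'(1)^2\bigr).
\intertext{Variable exponent: $B(u) = e^{\alpha(u)-\alpha(1)}$ and $\beta_n = \log n$, so that}
\mu_n &\approx \alpha'(1)\,\log n, &
\sigma_n^2 &\approx \bigl(\alpha'(1) + \alpha''(1)\bigr)\log n.
\intertext{Movable singularity: $B(u) = \rho(1)/\rho(u)$ and $\beta_n = n$, so that}
\mu_n &\approx -\frac{\rho'(1)}{\rho(1)}\, n, &
\sigma_n^2 &\approx \bigl(B''(1) + B'(1) - B'(1)^2\bigr)\, n.
\end{align*}
```

In the first case $B'(1) = \alpha'(1)$ and $B''(1) = \alpha''(1) + \alpha'(1)^2$, which is why the squared term cancels in the variance.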


The case of a variable exponent typically arises from the set construction, in the
context of the exp-log schema introduced in Section VII.2 (p. 445), which covers the
cycle decomposition of permutations, connected components in random mappings, as
well as the factorization of polynomials over finite fields. The mean value analyses
of Chapter VII are then nicely supplemented by limit Gaussian laws, as we prove in
Subsection IX.7.1. Trees often lead to singularities that are of the square-root type
and such a singular behaviour persists for a number of bivariate generating functions
associated to additively inherited parameters (for instance the number of leaves). In
that case, the singular exponent remains constant (equal to 1/2), while the singularity
moves. The basic technology adequate for such movable singularities is developed
in Subsection IX.7.2, where it is illustrated by means of simple examples relative to
trees.

A notable feature of complex analytic methods is to be applicable to functions
only known implicitly through a functional equation of sorts. We study implicit systems
and algebraic functions in Subsection IX.7.3: there, movable singularities are
found, resulting in Gaussian limits in the scale of $n$. Differential systems display a
broader range of singular behaviours, as discussed in Subsection IX.7.4, to the effect
that Gaussian laws can arise, both in the scale of $\log n$ and of $n$.


IX.7.1. Variable exponents and the exp-log schema. The organization of this
subsection is as follows. First, we state an easy but crucial lemma (Lemma IX.2)
that takes care of the remainder terms in the expansions and hence enables the use of
singularity analysis in a perturbed context. Then, we state a general theorem relative
to the case of a fixed singularity and a variable exponent (Theorem IX.11). The major
application is to the analysis of the exp-log schema as introduced in Section VII.2,
p. 445: Gaussian laws in the scale of $\log n$ are found to hold true for the number of
components in several of the most classical structures of combinatorial theory.


Uniform expansions. The basis of the developments in this section is a unifor- 
mity lemma obtained from a simple re-examination of basic singularity analysis in the 
perspective of bivariate asymptotics. 

Lemma IX.2 (Uniformity lemma, singularity analysis). Let $f_u(z)$ be a family of functions
analytic in a common $\Delta$-domain $\Delta$, with $u$ a parameter taken in a set $U$.
Suppose that there holds

$$ |f_u(z)| \le K(u)\,\bigl|(1-z)^{-\alpha(u)}\bigr|, \qquad z \in \Delta,\ u \in U, \tag{52} $$

where $K(u)$ and $\alpha(u)$ remain absolutely bounded: $K(u) \le K$ and $|\alpha(u)| \le A$,
for $u \in U$. Let $B$ be such that $\Re(\alpha(u)) \le -B$. Then, there exists a constant $\lambda$
(computable from $A$, $B$, $\Delta$) such that

$$ \bigl|[z^n] f_u(z)\bigr| < \lambda K n^{-B-1}. \tag{53} $$

Proof. It suffices to revisit the proof of the Big-Oh Transfer Theorem (Theorem VI.3,
p. 390), paying due attention to uniformity. The proof starts from Cauchy's formula,

$$ f_{u,n} \equiv [z^n] f_u(z) = \frac{1}{2i\pi} \int_{\gamma} f_u(z)\, \frac{dz}{z^{n+1}}, $$

where $\gamma = \bigcup_j \gamma_j$ is the Hankel contour displayed in Figure VI.6, p. 390. This contour
is comprised of an inner circular arc ($\gamma_1$), an outer arc ($\gamma_4$), and two connecting linear
parts ($\gamma_2$, $\gamma_3$); its half-angle is $\theta$.

Decompose $\alpha(u)$ into its real and imaginary parts and set $\alpha(u) = \sigma(u) + i\tau(u)$.
Also, set $z = 1 + t/n$, so that $t$ lies on an image contour $\widetilde{\gamma} = n(\gamma - 1)$, and write
$t = \rho e^{i\xi}$. We have

$$ \bigl|(1-z)^{-\alpha(u)}\bigr| = \left|\left(-\frac{t}{n}\right)^{-\sigma(u)}\right| \cdot \left|\left(-\frac{t}{n}\right)^{-i\tau(u)}\right|, \tag{54} $$

with $|\tau(u)| \le A$. As $t$ varies along $\widetilde{\gamma}$, its argument $\xi$ decreases continuously
from $2\pi - \theta$ to $\theta$. Thus, the second factor on the right of (54) remains bounded
independently of $n$:

$$ \left|\left(-\frac{t}{n}\right)^{-i\tau(u)}\right| = \left|\left(-\frac{\rho e^{i\xi}}{n}\right)^{-i\tau(u)}\right| < \lambda_1, $$

for some computable $\lambda_1 > 0$. In summary, we have found, for $z$ on $\gamma$,

$$ \bigl|(1-z)^{-\alpha(u)}\bigr| \le \lambda_1 \bigl|(1-z)^{-\sigma(u)}\bigr|, \tag{55} $$

where $\sigma(u)$ is real and $-\sigma(u) \ge B$.

At this final stage, making use of (55), we can bound $[z^n] f_u(z)$ by a curvilinear
integral:

$$ \bigl|[z^n] f_u(z)\bigr| \le \frac{\lambda_1 K}{2\pi} \int_{\gamma} \bigl|(1-z)^{-\sigma(u)}\bigr|\, \frac{|dz|}{|z|^{n+1}}. $$

A direct application of the majorizations used in the proof of Theorem VI.3 then
establishes the statement. □
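
The content of the lemma can be observed numerically on the simplest family, $f_u(z) = (1-z)^{-\alpha}$ with $\alpha$ ranging over an interval. The following is only an illustrative sketch: the interval $[-1.5, -0.5]$, the choice $B = 1/2$ and the names `coeff`, `worst_scaled_coefficient`, `ALPHAS`, `SCALED` are assumptions made here, not part of the text. It checks that the worst value of $|[z^n] f_u(z)|\, n^{B+1}$ over the family stays bounded as $n$ grows, which is what (53) asserts.

```python
def coeff(alpha, n):
    """Exact value of [z^n] (1 - z)^(-alpha), as prod_{k=1..n} (alpha + k - 1) / k."""
    c = 1.0
    for k in range(1, n + 1):
        c *= (alpha + k - 1) / k
    return c


def worst_scaled_coefficient(n, alphas, B):
    """Largest |[z^n] f_u(z)| * n^(B + 1) over the family of exponents `alphas`."""
    return max(abs(coeff(a, n)) * n ** (B + 1) for a in alphas)


# Hypothetical family: alpha in [-1.5, -0.5], so that Re(alpha) <= -B with B = 0.5.
ALPHAS = [-1.5 + 0.01 * i for i in range(101)]
SCALED = {n: worst_scaled_coefficient(n, ALPHAS, 0.5) for n in (10, 100, 1000, 10000)}

if __name__ == "__main__":
    for n, value in SCALED.items():
        print(n, round(value, 4))
```

For this particular family the scaled values settle near $1/|\Gamma(-1/2)| = 1/(2\sqrt{\pi})$, the extreme exponent $\alpha = -1/2$ being the one that dominates.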


▷ IX.36. Uniformity in the presence of logarithmic multipliers. Similar estimates hold when
$f_u(z)$ is multiplied by a power of $L(z) = -\log(1 - z)$: if the condition (52) is replaced by

$$ |f_u(z)| \le K(u)\,\bigl|(1-z)^{-\alpha(u)}\bigr|\,|L(z)|^{\beta}, $$

for some $\beta \in \mathbb{R}$, then one has

$$ \bigl|[z^n] f_u(z)\bigr| < \lambda K n^{-B-1} (\log n)^{\beta}, $$

for some $\lambda = \lambda(A, B, \Delta, \beta)$ (compare with (53)). ◁

The prototypical instance of a bivariate GF with a fixed singularity and a variable
exponent is that of $F(z,u) := C(z)^{-\alpha(u)}$. We can in fact state a slightly more general
result guaranteeing the presence of a Gaussian limit law in this and similar cases.

Theorem IX.11 (Variable exponent perturbation). Let $F(z,u)$ be a bivariate function
that is analytic at $(z,u) = (0,0)$ and has non-negative coefficients. Assume the
following conditions.

(i) Analytic exponents. There exist $\epsilon > 0$ and $r > \rho$ such that, with the domain $D$
defined by

$$ D = \{(z,u) \mid |z| \le r,\ |u-1| \le \epsilon\}, $$

the function $F(z,u)$ admits the representation

$$ F(z,u) = A(z,u) + B(z,u)\,C(z)^{-\alpha(u)}, \tag{56} $$

where $A(z,u)$, $B(z,u)$ are analytic for $(z,u) \in D$. Suppose also that the function
$\alpha(u)$ is analytic in $|u-1| \le \epsilon$ with $\alpha(1) \notin \{0, -1, -2, \ldots\}$ and $C(z)$ is analytic for
$|z| \le r$, with the equation $C(z) = 0$ having a unique root $\rho \in (0, r)$ in the disc $|z| \le r$
that is simple and such that $B(\rho, 1) \ne 0$.

(ii) Variability: one has

$$ \alpha'(1) + \alpha''(1) \ne 0. $$

Then the variable with probability generating function

$$ p_n(u) = \frac{[z^n] F(z,u)}{[z^n] F(z,1)} $$

converges in distribution to a Gaussian variable with a speed of convergence
$O((\log n)^{-1/2})$. The corresponding mean $\mu_n$ and variance $\sigma_n^2$ satisfy

$$ \mu_n \sim \alpha'(1) \log n, \qquad \sigma_n^2 \sim (\alpha'(1) + \alpha''(1)) \log n. $$


Proof. Clearly, for the univariate problem, by singularity analysis, one has

$$ [z^n] F(z,1) = B(\rho,1)\,(-\rho C'(\rho))^{-\alpha(1)}\, \rho^{-n}\, \frac{n^{\alpha(1)-1}}{\Gamma(\alpha(1))} \left(1 + O\!\left(\frac{1}{n}\right)\right). \tag{57} $$

For the bivariate problem, the contribution to $[z^n]F(z,u)$ arising from $[z^n]A(z,u)$ is
uniformly exponentially smaller than $\rho^{-n}$, since $A(z,u)$ is $z$-analytic in $|z| \le r$.
Write next

$$ B(z,u) = \bigl(B(z,u) - B(\rho,u)\bigr) + B(\rho,u). $$

The first term satisfies

$$ B(z,u) - B(\rho,u) = O((z - \rho)), $$

uniformly with respect to $u$, since

$$ \frac{B(z,u) - B(\rho,u)}{z - \rho} $$

is analytic for $(z,u) \in D$ (as seen by division of power series representations). Let
$A$ be an upper bound on $|\alpha(u)|$ for $|u - 1| \le \epsilon$. Then, by singularity analysis and its
companion uniformity lemma,

$$ [z^n] \bigl(B(z,u) - B(\rho,u)\bigr)\, C(z)^{-\alpha(u)} = O(\rho^{-n} n^{A-2}). \tag{58} $$

By suitably restricting the domain of $u$, one may freely assume that $A < \alpha(1) + 1/2$
(say), ensuring that $A - 2 < \alpha(1) - 3/2$. Thus, the contribution arising from (58) is
uniformly polynomially small (by a factor $O(n^{-1/2})$).

It only remains to analyse

$$ [z^n] B(\rho,u)\, C(z)^{-\alpha(u)}. $$

This is done exactly like in the univariate case: we have, uniformly for $u$ in a small
neighbourhood of 1,

$$ C(z)^{-\alpha(u)} = (-\rho C'(\rho))^{-\alpha(u)}\, (1 - z/\rho)^{-\alpha(u)}\, \bigl(1 + O(1 - z/\rho)\bigr), \tag{59} $$

and, taking once more advantage of the uniformity afforded by singularity analysis,
we find by (58) and (59):

$$ [z^n] F(z,u) = \frac{B(\rho,u)\,\rho^{-n}}{\Gamma(\alpha(u))}\, (-\rho C'(\rho))^{-\alpha(u)}\, n^{\alpha(u)-1} \left(1 + O(n^{-1/2})\right). $$

Thus, the Quasi-powers Theorem applies and the law is Gaussian in the limit. □
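
As a sanity check of the uniform estimate, one may take the prototype $F(z,u) = (1-z)^{-u}$, that is, $A = 0$, $B = 1$, $C(z) = 1 - z$, $\alpha(u) = u$ and $\rho = 1$, for which $[z^n]F(z,u) = \Gamma(n+u)/(\Gamma(u)\,n!)$ exactly. The sketch below (function names and the grid of values of $u$ are choices made here for illustration) compares this exact value with the main term $n^{u-1}/\Gamma(u)$ and reports the worst relative error over $u \in [0.5, 1.5]$.

```python
from math import exp, lgamma, log


def pgf_exact(n, u):
    """[z^n] (1 - z)^(-u) = Gamma(n + u) / (Gamma(u) * n!), computed through logarithms."""
    return exp(lgamma(n + u) - lgamma(u) - lgamma(n + 1))


def pgf_quasi_power(n, u):
    """Main term n^(alpha(u) - 1) / Gamma(alpha(u)) with alpha(u) = u."""
    return exp((u - 1) * log(n) - lgamma(u))


def max_relative_error(n, us):
    """Worst relative error of the main term over the sample points `us`."""
    return max(abs(pgf_exact(n, u) / pgf_quasi_power(n, u) - 1) for u in us)


US = [0.5 + 0.05 * i for i in range(21)]  # sample points for u in [0.5, 1.5]
ERRORS = {n: max_relative_error(n, US) for n in (10, 100, 1000)}

if __name__ == "__main__":
    for n, err in ERRORS.items():
        print(n, err)
```

In this explicit case the error decreases like $1/n$ uniformly in $u$, which is better than the $O(n^{-1/2})$ guaranteed in general by the proof.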


The exp-log schema. The next proposition covers the exponential-logarithmic
("exp-log") schema of Section VII.2, p. 445, which is amenable to singularity
perturbation techniques.

Proposition IX.14 (Sets of labelled logarithmic structures). Consider the labelled set
construction $\mathcal{F} = \operatorname{SET}(\mathcal{G})$. Assume that $G(z)$ has radius of convergence $\rho$ and is
$\Delta$-continuable with a singular expansion of the form

$$ G(z) = \kappa \log\frac{1}{1 - z/\rho} + \lambda + O\!\left(\frac{1}{\log^2(1 - z/\rho)}\right). $$

Then, the limit law of the number of $\mathcal{G}$-components in a large $\mathcal{F}$-structure is
asymptotically Gaussian with mean and variance each asymptotic to $\kappa \log n$ and with speed
of convergence $O((\log n)^{-1/2})$.

Proof. Use the enhanced version of the uniformity lemma in Note IX.36. A quasi-power
approximation of the form $p_n(u) \approx n^{\alpha(u)-\alpha(1)}$, with $\alpha(u) = \kappa u$, results from
developments of the same type as in the proof of Theorem IX.11. □

Clearly, all the labelled structures of Section VII.2 (p. 445) are covered by this
proposition. A few examples, related to permutations, 2-regular graphs, and mappings,
follow.
Example IX.18. Cycles in derangements. The bivariate EGF for permutations with $u$ marking
the number of cycles is given by the specification

$$ \mathcal{F} = \operatorname{SET}(u\,\operatorname{CYC}(\mathcal{Z})) \implies F(z,u) = \sum_{n,k} \left[{n \atop k}\right] u^k \frac{z^n}{n!} = \exp\left(u \log\frac{1}{1-z}\right), $$

so that we are in the simplest case of an exp-log schema. Proposition IX.14 implies immediately
that the number of cycles in a random permutation of size $n$ converges to a Gaussian limiting
distribution. (This classical result stating the asymptotic normality of the distribution of the
Stirling cycle numbers could be derived directly in Proposition IX.5, p. 645, thanks to the explicit
character of the horizontal generating functions, the Stirling polynomials, in this particular
case.)

Similarly, the number of cycles is asymptotically normal in generalized derangements
(Examples II.14, p. 122 and VII.1, p. 448) where a finite set $S$ of cycle lengths is forbidden. This
results immediately from Proposition IX.14, given the BGF

$$ \mathcal{F} = \operatorname{SET}(u\,\operatorname{CYC}_{\mathbb{Z}_{>0} \setminus S}(\mathcal{Z})) \implies F(z,u) = \exp\left(u\left[\log\frac{1}{1-z} - \sum_{s \in S} \frac{z^s}{s}\right]\right). $$

The classical derangement problem corresponds to $S = \{1\}$. □
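
For unrestricted permutations everything is explicit, so the logarithmic scale can be checked exactly. The sketch below is an illustration added here (names such as `cycle_count_distribution` are ours). It expands the PGF $p_n(u) = \prod_{j=1}^{n} (u + j - 1)/j$, which is $[z^n](1-z)^{-u}$, one factor at a time, and compares the exact mean and variance of the number of cycles with the harmonic sums $H_n$ and $H_n - \sum_{j \le n} j^{-2}$, both of which are $\log n + O(1)$.

```python
from fractions import Fraction
from math import log


def cycle_count_distribution(n):
    """Return [P(K_n = k) for k = 0..n], K_n = number of cycles of a random permutation."""
    probs = [Fraction(1)]  # size 0: no cycle, with probability 1
    for j in range(1, n + 1):
        new = [Fraction(0)] * (len(probs) + 1)
        for k, p in enumerate(probs):
            new[k] += p * Fraction(j - 1, j)  # factor (j - 1)/j: no new cycle
            new[k + 1] += p * Fraction(1, j)  # factor u/j: one more cycle
        probs = new
    return probs


def mean_and_variance(probs):
    """Exact mean and variance of a distribution given as a list of probabilities."""
    mean = sum(k * p for k, p in enumerate(probs))
    second = sum(k * k * p for k, p in enumerate(probs))
    return mean, second - mean * mean


N = 60
MEAN, VAR = mean_and_variance(cycle_count_distribution(N))
HARMONIC = sum(Fraction(1, j) for j in range(1, N + 1))

if __name__ == "__main__":
    print(float(MEAN), float(VAR), log(N))
```

The same expansion applies to generalized derangements once the factors are replaced by the coefficients of the corresponding BGF; only the bounded additive terms change.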


Example IX.19. 2-regular graphs. A 2-regular graph is an undirected graph such that each
vertex has degree exactly 2. Any 2-regular graph may be decomposed into a product of connected
components that are undirected cycles of length at least 3 (Note II.22, p. 133 and Example VII.2,
p. 449). Hence the bivariate EGF for 2-regular graphs, with $u$ marking the number of connected
components, is given by

$$ \mathcal{F} = \operatorname{SET}(u\,\operatorname{UCYC}_{\ge 3}(\mathcal{Z})) \implies F(z,u) = \exp\left(u\left[\frac{1}{2}\log\frac{1}{1-z} - \frac{z}{2} - \frac{z^2}{4}\right]\right). $$

By the logarithmic character of the function inside the exponential, the number of connected
components in a 2-regular graph has a Gaussian limit distribution. □
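
Coefficients of any BGF of the form $\exp(u\,G(z))$ can be extracted exactly, which gives a way of testing such statements for moderate sizes. The sketch below is an illustration under our own naming (`set_bgf_coefficients`, `mean_components`, `undirected_cycles` are hypothetical helpers). It relies on the recurrence $n f_n(u) = u \sum_{k=1}^{n} k\, g_k\, f_{n-k}(u)$, obtained by differentiating $F = \exp(uG)$ with respect to $z$, and applies it to 2-regular graphs, where $g_k = 1/(2k)$ for $k \ge 3$.

```python
from fractions import Fraction
from math import factorial, log


def set_bgf_coefficients(g, n_max):
    """Return f with f[n][j] = [z^n u^j] exp(u * G(z)), where G(z) = sum_{k>=1} g(k) z^k."""
    f = [[Fraction(1)]]  # f_0(u) = 1
    for n in range(1, n_max + 1):
        poly = [Fraction(0)] * (n + 1)
        for k in range(1, n + 1):
            weight = k * g(k)
            if weight == 0:
                continue
            for j, c in enumerate(f[n - k]):
                poly[j + 1] += weight * c / n  # multiplication by u shifts the degree
        f.append(poly)
    return f


def mean_components(poly):
    """Mean number of components, given the coefficients in u of f_n(u) (non-zero sum)."""
    return sum(j * c for j, c in enumerate(poly)) / sum(poly)


def undirected_cycles(k):
    """EGF coefficient of undirected cycles of length k >= 3: ((k - 1)!/2) / k!."""
    return Fraction(1, 2 * k) if k >= 3 else Fraction(0)


F = set_bgf_coefficients(undirected_cycles, 60)

if __name__ == "__main__":
    print([factorial(n) * sum(F[n]) for n in range(8)])
    print(float(mean_components(F[60])), 0.5 * log(60))
```

Passing `lambda k: Fraction(1, k)` instead of `undirected_cycles` recovers the cycle counts of permutations; the mean then grows like $\log n$ rather than $\frac{1}{2}\log n$.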


Example IX.20. Connected components in mappings. Mappings from a finite set to itself
can be represented as labelled functional graphs. With $u$ marking the number of connected
components, the specification is (Subsection II.5.2, p. 129 and Example VII.3, p. 449)

$$ \mathcal{F} = \operatorname{SET}(u\,\operatorname{CYC}(\mathcal{T})) \implies F(z,u) = \exp\left(u \log\frac{1}{1 - T(z)}\right), $$

where $T(z)$ is the Cayley tree function defined implicitly by the relation $T(z) = z\exp(T(z))$.
By the inversion theorem for implicit functions (Example VI.8, p. 403), we have a square-root
singularity,

$$ T(z) = 1 - \sqrt{2(1 - ez)} + O(1 - ez), $$

so that

$$ F(z,u) = \exp\left(u\left[\frac{1}{2}\log\frac{1}{1 - ez} - \frac{1}{2}\log 2 + O\bigl((1 - ez)^{1/2}\bigr)\right]\right). $$

From Proposition IX.14, we obtain a theorem originally due to Stepanov [559]: The number of
components in functional digraphs has a limiting Gaussian distribution.

This approach extends to functional digraphs satisfying various degree constraints as
considered in [18]. This analysis and similar ones are relevant to integer factorization, using
Pollard's "rho" method [247, 379, 538]. □
Unlabelled constructions. In the unlabelled universe, the class of all finite
multisets over a class $\mathcal{G}$ has ordinary bivariate generating function given by

$$ \mathcal{F} = \operatorname{MSET}(u\mathcal{G}) \implies F(z,u) = \exp\left(\frac{u}{1} G(z) + \frac{u^2}{2} G(z^2) + \frac{u^3}{3} G(z^3) + \cdots\right), $$

where $u$ marks the number of $\mathcal{G}$-components (Chapter III).

The function $F(z,u)$ is consequently of the form $F(z,u) = e^{uG(z)} B(z,u)$, where
$B(z,u)$ collects the contributions arising from $G(z^2), G(z^3), \ldots$. If the radius of
convergence $\rho$ of $G(z)$ is assumed to be strictly less than 1, then, as is easily checked,
the function $B(z,u)$ is bivariate analytic in $|u| < 1 + \epsilon$, $|z| < R$ for some $\epsilon > 0$
and $R > \rho$. Here, we are interested in structures such that $G(z)$ has a logarithmic
singularity, in which case the conclusions of Proposition IX.14 relative to the
construction $\mathcal{F} = \operatorname{MSET}(u\mathcal{G})$ hold (this is verified by a simple combination of the proofs
of Proposition IX.14 and Theorem IX.11). In summary:

For the construction $\mathcal{F} = \operatorname{MSET}(\mathcal{G})$, under the assumption that $\rho < 1$ and
$G(z)$ is logarithmic, the number of $\mathcal{G}$-components in a random $\mathcal{F}_n$-structure
is asymptotically Gaussian in the scale of $\log n$, with speed $O((\log n)^{-1/2})$.

The same property also holds for the unlabelled powerset construction $\mathcal{F} = \operatorname{PSET}(\mathcal{G})$.

In what follows, we present two illustrations, one relative to the factorization of
polynomials over finite fields, the other to unlabelled functional graphs.


Example IX.21. Polynomial factorization. Fix a finite field $\mathbb{F}_p$ and consider the class $\mathcal{P}$ of
monic polynomials (having leading coefficient 1) in the polynomial ring $\mathbb{F}_p[z]$, with $\mathcal{I}$ the
subclass of irreducible polynomials. The algebraic analysis has been performed in Example I.20,
p. 90. One has $P_n = p^n$ and

$$ P(z) = (1 - pz)^{-1}. $$

Because of the unique factorization property, a polynomial is a multiset of irreducible
polynomials, whence the relation

$$ P(z) = \exp\left(\frac{I(z)}{1} + \frac{I(z^2)}{2} + \frac{I(z^3)}{3} + \cdots\right). $$

The preceding relation can be inverted using Möbius inversion. With $L(z) = \log P(z)$, we have

$$ I(z) = \sum_{k \ge 1} \mu(k)\, \frac{L(z^k)}{k} = \log\frac{1}{1 - pz} + \sum_{k \ge 2} \mu(k)\, \frac{L(z^k)}{k}, $$

where $\mu$ is the Möbius function.

As is apparent, $I(z)$ is logarithmic (it is indeed the sum of a logarithmic term and a
function analytic for $|z| < p^{-1/2}$: see Example VII.4, p. 449). We have yet another instance of
the exp-log schema (with $\kappa = 1$). Hence:

Proposition IX.15. Let $\Omega_n$ be the random variable representing the number of irreducible
factors of a random polynomial of degree $n$ over $\mathbb{F}_p$, each factor being counted with its order
of multiplicity. Then as $n$ tends to infinity, we have, for any real $x$:

$$ \lim_{n \to +\infty} \mathbb{P}\{\Omega_n < \log n + x\sqrt{\log n}\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\, dt. $$

This statement, which originally appears in [258], constitutes a counterpart of the famous
Erdős-Kac Theorem (1940) for the number of prime divisors of natural numbers (with here
$\log n$ that replaces $\log\log n$ when dealing with integers at most $n$; see [576]). The speed of
convergence is once more $O((\log n)^{-1/2})$. Also, by the same devices, the same property holds
for the parameter $\omega_n$ that represents the number of distinct irreducible factors in a random
polynomial of degree $n$. □
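
The quantities of this example are easy to compute exactly. The sketch below is an illustration added here, with hypothetical helper names. It obtains $I_n$ by Möbius inversion of $\sum_{d \mid n} d\, I_d = p^n$, then evaluates the exact mean of $\Omega_n$ through the formula $\mathbb{E}(\Omega_n) = \sum_{m=1}^{n} p^{-m} \sum_{k \mid m} I_k$. That formula is derived here, not quoted from the text: it follows by differentiating the BGF $\prod_{k} (1 - u z^k)^{-I_k}$ at $u = 1$.

```python
from fractions import Fraction
from math import log


def mobius(n):
    """Moebius function mu(n), by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:  # a squared prime divides the argument
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result


def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def irreducible_count(p, n):
    """I_n: number of monic irreducible polynomials of degree n over F_p."""
    return sum(mobius(d) * p ** (n // d) for d in divisors(n)) // n


def mean_irreducible_factors(p, n):
    """Exact mean of Omega_n over the p^n monic polynomials of degree n."""
    return sum(
        Fraction(sum(irreducible_count(p, k) for k in divisors(m)), p ** m)
        for m in range(1, n + 1)
    )


if __name__ == "__main__":
    for n in (10, 100, 200):
        print(n, float(mean_irreducible_factors(2, n)), log(n))
```

The printed means exceed $\log n$ by a bounded amount, in agreement with a mean asymptotic to $\log n$.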


It is perhaps instructive to re-examine this last example at an abstract level, in the 
light of general principles of analytic combinatorics. 


A polynomial over a finite field is determined by the sequence of its coeffi- 
cients. Hence, the class of all polynomials, as a sequence class, has a polar 
singularity. On the other hand, unique factorization entails that a polyno- 
mial is also a multiset of irreducible factors (“primes”). Thus, the class of 
irreducible polynomials, that is implicitly determined, is logarithmic, since 
the multiset construction to be inverted is in essence an exponential operator.
As a consequence of the exp-log schema, the number of irreducible
factors is asymptotically Gaussian. 


Example IX.22. Unlabelled functional graphs (mapping patterns). These are unlabelled
directed graphs in which each vertex has outdegree equal to 1 (Chapter VII, p. 480). The
specification of the class $\mathcal{F}$ of such digraphs is

$$ \mathcal{F} = \operatorname{MSET}(\mathcal{L}), \qquad \mathcal{L} = \operatorname{CYC}(\mathcal{H}), \qquad \mathcal{H} = \mathcal{Z} \times \operatorname{MSET}(\mathcal{H}), $$

corresponding to multisets of cycles of rooted unlabelled trees $\mathcal{H}$.

Analytically, we know from Section VII.5 (p. 475) relative to non-plane trees that $H(z)$
has a dominant square-root singularity:

$$ H(z) = 1 - \gamma\sqrt{1 - z/\eta} + O(1 - z/\eta), $$

where $\eta \doteq 0.33832$ and $\gamma$ is some positive constant. As a consequence, $L(z)$, which is obtained
by translating an unlabelled cycle construction, is logarithmic with parameter $\kappa = 1/2$. Thus:
The number of components in a mapping pattern has a Gaussian limit distribution, with mean
and variance each of the form $\frac{1}{2}\log n$. □


▷ IX.37. Arithmetical semigroups. Knopfmacher [370] defines an arithmetical semigroup as a
semigroup with unique factorization, together with a size function (or degree) such that

$$ |xy| = |x| + |y|, $$

and the number of elements of a fixed size is finite. If $\mathcal{P}$ is an arithmetical semigroup and $\mathcal{I}$ its
set of 'primes' (irreducible elements), axiom A# of Knopfmacher asserts the condition

$$ \operatorname{card}\{x \in \mathcal{P} \mathrel{/} |x| = n\} = c\,q^n + O(q^{\alpha n}) \qquad (\alpha < 1), $$

with $q > 1$. It is shown in [370] that several algebraic structures forming arithmetical
semigroups satisfy axiom A#, and thus the conditions of Theorem IX.11 are automatically verified.
Therefore, the results deriving from Theorem IX.11 fit into the framework of Knopfmacher's
"abstract analytic number theory": they provide general conditions under which theorems of
the Erdős-Kac type must hold true. Examples of application mentioned in [370] are Galois
polynomial rings (the case of polynomial factorization), finite modules or semi-simple finite
algebras over a finite field $K = \mathbb{F}_q$, integral divisors in algebraic function fields, ideals in the
principal order of an algebraic function field, finite modules, or semi-simple finite algebras over
a ring of integral functions. ◁
Figure IX.12. Small components of size < 20 in random permutations (left) and
random mappings (right) of size 1000: each object corresponds to a line and each
component is represented by a square of proportional area (for some of the mappings,
such components may be lacking).


▷ IX.38. A Central Limit Theorem on $GL_n(\mathbb{F}_q)$. The title of this note is that of an article by
Goh and Schmutz [297] who prove asymptotic normality for the number of irreducible factors
that the characteristic polynomial of a random $n \times n$ matrix with entries in $\mathbb{F}_q$ has. [Some
linear algebra relative to the canonical decomposition of matrices and due to Kung and Stong
is needed.] The topic of random matrix theory over finite fields is blossoming: see Fulman's
survey [272]. ◁


Number of fixed-size components in the exp-log schema. As we know all too
well, the cycle structure of permutations is a typical instance of the exp-log schema,
where everything is as explicit as can be. The Gaussian law for the total number of
cycles actually summarizes information relative to the number of 1-cycles, 2-cycles,
and so on. These can be analysed separately, and we learnt in Example IX.4 (p. 625)
that, for $m$ fixed, the number of $m$-cycles is asymptotically Poisson($1/m$); in a way,
the Gaussian law for cycles appears as the resultant of a large number of Poisson
variables of slowly decreasing rates. As a matter of fact, similar properties hold true
for any labelled class that belongs to the exp-log schema, namely, the number of
$m$-components is in general asymptotically Poisson($\lambda_m$), where the rate $\lambda_m$ is
computable and satisfies $\lambda_m = O(1/m)$; see Figure IX.12 for an illustration. (The alert
reader may have noticed that we already obtained this property directly in
Proposition VII.1 on p. 451, relative to profiles of exp-log structures, and that it is similar in
spirit to what happens in subcritical constructions of Proposition IX.3, p. 633, although
now the exp-log schema is critical!) Here we briefly indicate how such properties can
be obtained by singularity perturbation: no quasi-power approximation is involved
since a discrete-to-discrete convergence occurs, but the uniformity properties of the
singularity analysis process, Lemma IX.2, p. 668, remain a central ingredient of the
synthetic analysis to be developed below.
Example IX.23. Fixed-size components in sets of logarithmic structures¹⁴. The number of
components of some fixed size $m$ in a set construction corresponds to the specification

$$ \mathcal{F} = \operatorname{SET}\bigl(u\,\mathcal{G}_m + (\mathcal{G} \setminus \mathcal{G}_m)\bigr) \implies F(z,u) = \exp\bigl(G(z) + (u-1)\,g_m z^m\bigr), $$

where $F(z,u)$ is an exponential BGF, $G(z)$ is an EGF, and $g_m := [z^m]G(z)$. As a consequence:

$$ F(z,u) = \exp\bigl((u-1)\,g_m z^m\bigr)\, F(z). $$

Under the assumption that $G(z)$ is logarithmic, one has, for $u$ in a small neighbourhood
of 1, as $z \to \rho$ in a $\Delta$-domain,

$$ F(z,u) = e^{\lambda}\,\omega(u)\,(1 - z/\rho)^{-\kappa} \Bigl(1 + O\bigl(\log^{-2}(1 - z/\rho)\bigr)\Bigr), \qquad \omega(u) = \exp\bigl((u-1)\,g_m \rho^m\bigr), $$

the uniformity of the expansion with respect to $u$ being granted by the same argument as in
Proposition IX.14. By singularity analysis, it is seen that

$$ [z^n]F(z,u) = \frac{e^{\lambda}\,\omega(u)}{\Gamma(\kappa)}\, \rho^{-n}\, n^{\kappa-1} \bigl(1 + O(\log^{-2} n)\bigr). $$

Given the particular shape of $\omega(u)$, this last estimate tells us that the number of $m$-components
in a random $\mathcal{F}$-structure of large size tends to a Poisson distribution with parameter
$\lambda_m := g_m \rho^m$.

This result applies for any $m$ less than some arbitrary fixed bound $B$. In addition, truly
multivariate methods evoked at the end of this chapter enable one to prove that the numbers
of components of sizes $1, 2, \ldots, B$ are asymptotically independent. This gives a very precise
model of the probabilistic profile of small components in random $\mathcal{F}$-objects as a product of
independent Poisson laws of parameter $g_m \rho^m$ for $m = 1, \ldots, B$. Similar results hold for
unlabelled multisets, but with the negative binomial law replacing the Poisson law. □
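
For permutations the Poisson limit can be observed on exact values. With $G(z) = \log(1-z)^{-1}$ one has $g_m = 1/m$ and $\rho = 1$, so that $F(z,u) = \exp((u-1)z^m/m)/(1-z)$ and the PGF of the number of $m$-cycles is the truncated exponential $\sum_{i \le \lfloor n/m \rfloor} ((u-1)/m)^i / i!$. The sketch below is an illustration with names chosen here: it expands this polynomial in $u$ and compares the result with the Poisson($1/m$) probabilities.

```python
from fractions import Fraction
from math import comb, exp, factorial


def m_cycle_distribution(n, m):
    """Exact law of the number of m-cycles in a uniform random permutation of size n.

    Entry k is the coefficient of u^k in sum_{i <= n // m} ((u - 1)/m)^i / i!.
    """
    top = n // m
    return [
        sum(
            Fraction((-1) ** (i - k) * comb(i, k), m ** i * factorial(i))
            for i in range(k, top + 1)
        )
        for k in range(top + 1)
    ]


def poisson_pmf(rate, k):
    """Probability that a Poisson variable of the given rate equals k."""
    return exp(-rate) * rate ** k / factorial(k)


if __name__ == "__main__":
    law = m_cycle_distribution(60, 2)
    for k in range(5):
        print(k, float(law[k]), poisson_pmf(0.5, k))
```

Convergence is extremely fast in this explicit case, since the only discrepancy comes from truncating an exponential series.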


▷ IX.39. Random mappings. The number of components of some fixed size $m$ in a large
random mapping (functional graph) is asymptotically Poisson($\lambda$) where $\lambda = K_m e^{-m}/m!$ and
$K_m = m!\,[z^m]\log(1 - T)^{-1}$ enumerates connected mappings. (There $T$ is the Cayley tree
function.) The fact that $K_m e^{-m}/m! \approx 1/(2m)$ explains the fact that small components are
somewhat sparser for mappings than for permutations (Figure IX.12). ◁
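
The approximation $K_m e^{-m}/m! \approx 1/(2m)$ can be checked numerically. The sketch below is an illustration, with function names of our choosing. It assumes the closed form $K_m = (m-1)! \sum_{k=0}^{m-1} m^k/k!$, which follows from Lagrange inversion applied to $\log(1-T)^{-1}$ and is not stated in the text, and it tabulates the ratio of the Poisson rate to $1/(2m)$.

```python
from fractions import Fraction
from math import exp, factorial


def connected_mappings(m):
    """K_m = (m - 1)! * sum_{k < m} m^k / k!, the number of connected mappings of size m."""
    return sum(factorial(m - 1) // factorial(k) * m ** k for k in range(m))


def poisson_rate(m):
    """lambda = K_m * e^(-m) / m!, the limiting mean number of components of size m."""
    return float(Fraction(connected_mappings(m), factorial(m))) * exp(-m)


# Ratio of the rate to 1/(2m); it should approach 1 as m grows.
RATIOS = {m: 2 * m * poisson_rate(m) for m in (1, 10, 100)}

if __name__ == "__main__":
    for m, ratio in RATIOS.items():
        print(m, ratio)
```

The ratio approaches 1 only at a rate of order $m^{-1/2}$, so for very small $m$ the rate is noticeably below $1/(2m)$.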

The last example concludes our detailed investigation of exp-log structures, and
we may legitimately regard the most basic phenomena as well understood.
Example IX.23 quantifies the distribution of the number of "small" components, whose
presence is fairly sporadic (Figure IX.12) and for which an asymptotically independent
Poisson structure prevails. Panario and Richmond [470] have further succeeded
in proving that the size of the smallest component is asymptotically $O(\log n)$ on
average. "Large" components also enjoy a rich set of properties. They cannot be
independently distributed, since, for instance, a permutation can have only one cycle
larger than $n/2$, two cycles larger than $n/3$, etc. As shown by Gourdon [305] under
general exp-log conditions, the size of the largest component is $\Theta(n)$ on average and
in probability, and the limit law involves the Dickman function otherwise known to
describe the distribution of the largest prime divisor of a random integer over a large
interval. A general probabilistic theory of the joint distribution of largest components
in exp-log structures has been developed by Arratia, Barbour, and Tavaré [20],
some of the initial developments of that theory drawing their inspiration from earlier
combinatorial-analytic studies. The joint distribution of large components appears to
be characterized in terms of what is known as the Poisson-Dirichlet process.

¹⁴ This example revisits the analysis of Proposition VII.1, p. 451, under the perspective of continuity
theorems for PGFs.


IX.7.2. Movable singularities. In accordance with the preliminary discussion
offered at the beginning of the section (p. 666), we now examine BGFs $F(z,u)$ such
that, for the function $z \mapsto F(z,u)$, the exponent at the singularity retains a constant
value, while the location of the singularity $\rho(u)$ moves smoothly with $u$, for $u$ kept
in a sufficiently small neighbourhood of 1. A prototypical instance is a BGF involving
a term $C(z,u)^{-\alpha}$, when $C(z,u)$ is bivariate analytic and $C(z,1)$ has an isolated
zero at the point $\rho = \rho(1)$. The developments in the present subsection can then be
seen as extending the perturbative analysis of meromorphic functions in Theorem IX.9
(p. 656), where the latter corresponds to exponents restricted to $\alpha = 1, 2, \ldots$.

This subsection provides the general machinery for addressing such fixed-exponent
movable-singularity situations, and it is once more based on the uniformity
afforded by singularity analysis (Lemma IX.2, p. 668). We illustrate it by means of a
few simple examples related to trees, where BGFs are explicitly known. (The next two
subsections will explore further applications where BGFs are only accessible indirectly,
via implicit analytic (especially, algebraic) equations and differential equations.)
Our starting point is the following general statement, which parallels Theorem IX.9,
p. 656.


Theorem IX.12 (Algebraic singularity schema). Let $F(z,u)$ be a function that is
bivariate analytic at $(z,u) = (0,0)$ and has non-negative coefficients. Assume the
following conditions:

(i) Analytic perturbation: there exist three functions $A$, $B$, $C$, analytic in a
domain $D = \{|z| \le r\} \times \{|u-1| < \epsilon\}$, such that, for some $r_0$ with $0 < r_0 \le r$,
and $\epsilon > 0$, the following representation¹⁵ holds, with $\alpha \notin \mathbb{Z}_{\le 0}$,

$$ F(z,u) = A(z,u) + B(z,u)\,C(z,u)^{-\alpha}; \tag{60} $$

furthermore, assume that, in $|z| \le r$, there exists a unique root $\rho$ of the
equation $C(z,1) = 0$, that this root is simple, and that $B(\rho,1) \ne 0$.

(ii) Non-degeneracy: one has $\partial_z C(\rho,1) \cdot \partial_u C(\rho,1) \ne 0$, ensuring the existence
of a non-constant $\rho(u)$ analytic at $u = 1$, such that $C(\rho(u),u) = 0$ and
$\rho(1) = \rho$.

(iii) Variability: one has

$$ \mathfrak{v}\!\left(\frac{\rho(1)}{\rho(u)}\right) \ne 0. $$

Then, the random variable with probability generating function

$$ p_n(u) = \frac{[z^n] F(z,u)}{[z^n] F(z,1)} $$

converges in distribution to a Gaussian variable with a speed of convergence that is
$O(n^{-1/2})$. The mean $\mu_n$ and the variance $\sigma_n^2$ are asymptotically linear in $n$.

¹⁵ By unicity of analytic continuation, the representation of $F(z,u)$ only needs to be established
initially near $(z,u) = (0,1)$, that is, for $|z| < r_0$, for some (arbitrarily small) positive $r_0$.
Proof. We start with the asymptotic analysis of the univariate counting problem. By
the assumptions made, the function $F(z,1)$ is analytic in $|z| < \rho$ and continuable to
a $\Delta$-domain. It admits a singular expansion of the form

$$ F(z,1) = \bigl(a_0 + a_1(z-\rho) + \cdots\bigr) + \bigl(b_0 + b_1(z-\rho) + \cdots\bigr)\bigl(c_1(z-\rho) + c_2(z-\rho)^2 + \cdots\bigr)^{-\alpha}. \tag{61} $$

There, the $a_j, b_j, c_j$ represent the coefficients of the expansion in $z$ of $A$, $B$, $C$ for
$z$ near $\rho$ when $u$ is instantiated at 1. (We may consider $C(z,u)$ normalized by the
condition that $c_1$ is positive real, and take, e.g., $c_1 = 1$.) Singularity analysis then
implies the estimate

$$ [z^n] F(z,1) = b_0\,(-c_1\rho)^{-\alpha}\, \rho^{-n}\, \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left(1 + O\!\left(\frac{1}{n}\right)\right). \tag{62} $$

All that is needed now is a uniform lifting of relations (61) and (62), for $u$ in a small
neighbourhood of 1.

First, we observe that, by the analyticity assumption on $A$, the coefficient
$[z^n]A(z,u)$ is exponentially small compared to $\rho^{-n}$, for $u$ close enough to 1. Thus,
for our purposes, we may freely restrict attention to $[z^n]B(z,u)\,C(z,u)^{-\alpha}$. (The
function $A$ is only needed in some cases so as to ensure non-negativity of the first few
coefficients of $F$.)

Next, we observe that there exists, for $u$ sufficiently near to 1, a unique simple root
$\rho(u)$ near $\rho$ of the equation

$$ C(\rho(u), u) = 0, $$

which is an analytic function of $u$ and satisfies $\rho(1) = \rho$. This results from the
Analytic Implicit Function Theorem or, if one prefers, the Weierstrass Preparation
Theorem: see Appendix B.5: Implicit Function Theorem, p. 753.

At this stage, due to the changing geometry of $\Delta$-domains as $u$ varies, it proves
convenient to operate with a fixed rather than movable singularity. This is simply
achieved by considering the normalized function

$$ \Psi(z,u) := B(z\rho(u), u)\, C(z\rho(u), u)^{-\alpha}. $$

Provided $u$ is restricted to a suitably small neighbourhood of 1 and $z$ to $|z| < R$ for
some $R > 1$, the functions $B(z\rho(u), u)$ and $C(z\rho(u), u)$ are analytic in both $z$ and $u$
(by composition of analytic functions), while $C(z\rho(u), u)$ now has a fixed (simple)
zero at $z = 1$. There results that the function

$$ \frac{1}{1-z}\, C(z\rho(u), u) $$

has a removable singularity at $z = 1$ (by division of series expansions) and hence is
analytic in $|z| < R$ and $|u - 1| < \delta$, for some $\delta > 0$. In particular, near $z = 1$, $\Psi$
satisfies an expansion of the form

$$ \Psi(z,u) = (1-z)^{-\alpha} \sum_{j \ge 0} \psi_j(u)\,(1-z)^j, \tag{63} $$

that is convergent and such that each coefficient $\psi_j(u)$ is an analytic function of $u$ for
$|u - 1| < \delta$.



We can finally return to the analysis of $[z^n]F(z,u)$ and undo what has been done.
We have

$$ [z^n]F(z,u) = \rho(u)^{-n}\,[z^n]\Psi(z,u) + [z^n]A(z,u), $$

where the second term in the sum is (exponentially) negligible. Now, as we know
from (63) and surrounding considerations, the function $z \mapsto \Psi(z,u)$ is analytic in a
fixed $\Delta$-domain, in which it admits a uniform singular approximation obtained by a
simplification of (63),

$$ \Psi(z,u) = \psi_0(u)\,(1-z)^{-\alpha} + O\bigl((1-z)^{-\alpha+1}\bigr). $$

An application of the uniformity property of singularity analysis, Lemma IX.2, then
provides the estimate

$$ [z^n]F(z,u) = \psi_0(u)\,\rho(u)^{-n}\, \frac{n^{\alpha-1}}{\Gamma(\alpha)} \left(1 + O\!\left(\frac{1}{n}\right)\right), \tag{64} $$

uniformly, for $u$ restricted to a small neighbourhood of 1.

Equation (64) shows that $p_n(u) = f_n(u)/f_n(1)$, where $f_n(u) := [z^n]F(z,u)$,
satisfies precisely the conditions of the Quasi-powers Theorem, Theorem IX.8.
Therefore, the law with PGF $p_n(u)$ is asymptotically normal with a mean and a variance
that are both $O(n)$. Since the error term in (64) is $O(1/n)$, the speed of
convergence to the Gaussian limit is $O(1/\sqrt{n})$. □


The remarks following the statement of Theorem IX.9 apply. Accordingly, the
mean $\mu_n$ and variance $\sigma_n^2$ are computable by the general formula (37), and the
variability condition is expressible in terms of the values of $C$ and its derivatives at $(\rho, 1)$
by means of Equation (39), p. 657.


▷ IX.40. Logarithmic multipliers. The conclusions of Theorem IX.12 extend to functions
representable under the more general form ($k \in \mathbb{Z}_{\ge 0}$)

$$ F(z,u) = A(z,u) + B(z,u)\,C(z,u)^{-\alpha}\,\bigl(\log C(z,u)\bigr)^{k}. $$

(The proof follows the same pattern, based on Note IX.36, p. 669.) ◁


In the remainder of this subsection, we illustrate the use of Theorem IX.12 by 
means of examples involving an explicit fractional power of a bivariate analytic func- 
tion. Privileged cases of application of the theorem are the number of leaves in clas- 
sical varieties of trees, such as Cayley trees, general or binary Catalan trees, and 
Motzkin trees, for which the GFs lead to an explicit square-root expression. 


Example IX.24. Leaves in general Catalan trees. We revisit here under a complex asymptotic
angle the analysis of the number of leaves in general Catalan trees $\mathcal{G}$, a problem already
introduced in Example III.13, p. 182. The specification is

$$ \mathcal{G} = \mathcal{Z}u + \mathcal{Z} \times \operatorname{SEQ}_{\ge 1}(\mathcal{G}) \implies G(z,u) = zu + \frac{z\,G(z,u)}{1 - G(z,u)}, $$

with $u$ marking the number of leaves. The solution of the implied quadratic equation then yields
the explicit form

$$ G(z,u) = \frac{1}{2}\left(1 + (u-1)z - \sqrt{1 - 2(u+1)z + (u-1)^2 z^2}\right), $$

which is readily verified to be amenable to Theorem IX.12. Indeed, we have, in the notations of
that theorem,

$$ A(z,u) = \frac{1}{2}\bigl(1 + (u-1)z\bigr), \qquad B(z,u) = -\frac{1}{2}, \qquad C(z,u) = 1 - 2(u+1)z + (u-1)^2 z^2, $$

whose analyticity is obvious, together with the fixed exponent $\alpha = -1/2$. The factorization

$$ C(z,u) = \bigl(1 - z(1 + \sqrt{u})^2\bigr)\bigl(1 - z(1 - \sqrt{u})^2\bigr) $$

implies that the zeros of $z \mapsto C(z,u)$ are at $(1 \pm \sqrt{u})^{-2}$. In particular, if $|u - 1| < 1/10$ (say),
then the dominant singularity of $G(z,u)$ is at $\rho(u) = (1 + \sqrt{u})^{-2}$ and $\rho = \rho(1) = 1/4$, as it
should be.

Figure IX.13. A display of the family of GFs $z \mapsto F(z,u)$ corresponding to leaves
in general Catalan trees when $u \in [1/2, 3/2]$. It can be observed that the singularities
are all of the square-root type, with a movable singularity at $\rho(u) = (1 + u^{1/2})^{-2}$
(represented by the dashed line).

The analytic perturbation assumption of Theorem IX.12 (Condition (i)) is then satisfied,
with (say) $r = 1/3$. We next verify that $\partial_z C(\rho, 1) = -4$ and $\partial_u C(\rho, 1) = -1/2$, which
ensures non-degeneracy (Condition (ii)). Finally, variability (Condition (iii)) is satisfied since
$\mathfrak{v}(\rho(1)/\rho(u)) = 1/8$. Thus the theorem is applicable and the number of leaves is
asymptotically normal.

The smooth displacement of singularities induced by the secondary variable $u$, which lies
at the basis of such a Gaussian limit result, is illustrated in Figure IX.13. (Compare also with
Figure 0.6 of our Invitation, p. 10.) □
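The constants of Example IX.24 can be checked on exact counts. The sketch below is an illustration, not part of the original text: it assumes the rewriting $G = zu + (1-u)zG + G^2$ of the quadratic equation (obtained by clearing the denominator $1 - G$), and the function names are ours. It extracts the polynomials $[z^n]G(z,u)$ and compares the mean and variance of the number of leaves with the predicted $n/2$ and $n/8$.

```python
from fractions import Fraction

def leaf_polynomials(n_max):
    """g[n][k] = number of general Catalan trees with n nodes and k leaves.

    Assumes the equivalent form G = z*u + (1 - u)*z*G + G**2 of the
    quadratic equation satisfied by G(z, u).
    """
    g = [[0] for _ in range(n_max + 1)]        # g[0] is the zero polynomial
    for n in range(1, n_max + 1):
        poly = [0] * (n + 1)                   # coefficients of u^0 .. u^n
        if n == 1:
            poly[1] += 1                       # the term z*u
        for k, c in enumerate(g[n - 1]):       # the term (1 - u)*z*G
            poly[k] += c
            poly[k + 1] -= c
        for i in range(1, n):                  # the term G**2
            for a, ca in enumerate(g[i]):
                for b, cb in enumerate(g[n - i]):
                    poly[a + b] += ca * cb
        g[n] = poly
    return g

def mean_and_variance(poly):
    """Exact mean and variance of the distribution with weights poly[k]."""
    total = sum(poly)
    mean = Fraction(sum(k * c for k, c in enumerate(poly)), total)
    second = Fraction(sum(k * k * c for k, c in enumerate(poly)), total)
    return mean, second - mean * mean

if __name__ == "__main__":
    g = leaf_polynomials(60)
    for n in (10, 30, 60):
        mean, var = mean_and_variance(g[n])
        print(n, float(mean) / n, float(var) / n)   # compare with 1/2 and 1/8
```

For small sizes the output can be compared with a hand enumeration; for instance the trees with three nodes are the chain (one leaf) and the root with two children (two leaves).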


Example IX.25. Leaves in classical varieties of trees. First, for leaves in binary Catalan trees,
we have (Example III.14, p. 182)

$$\mathcal{B} = \mathcal{Z}u + 2(\mathcal{B} \times \mathcal{Z}) + (\mathcal{B} \times \mathcal{Z} \times \mathcal{B}) \quad\Longrightarrow\quad B(z,u) = z\left(u + 2B(z,u) + B(z,u)^2\right),$$

so that

$$B(z,u^2) = \frac{1}{2z}\left(1 - 2z - \sqrt{\bigl(1 - 2z(1+u)\bigr)\bigl(1 - 2z(1-u)\bigr)}\right).$$

This is almost the same as the BGF of leaves in general Catalan trees. The dominant singularity
of $z \mapsto B(z,u)$ is at $\rho(u) = \frac{1}{2(1+\sqrt{u})}$ and one finds $\mathfrak{v}(\rho(1)/\rho(u)) = 1/16$, so that the limit
law is Gaussian. The asymptotic forms of the mean and variance are also provided by $\rho(u)$:
the number of leaves $X_n$ in a binary Catalan tree of size $n$ satisfies $E\{X_n\} = \frac{1}{4}n + O(1)$ and
$\sigma\{X_n\} = \frac{1}{4}\sqrt{n} + O(n^{-1/2})$.

Next comes the case of Cayley trees (Note III.17, p. 183):

$$\mathcal{T} = \mathcal{Z}u + \mathcal{Z} \star \mathrm{SET}_{\ge 1}(\mathcal{T}) \quad\Longrightarrow\quad T(z,u) = z\left(u - 1 + e^{T(z,u)}\right).$$

(The distribution is closely related to the Stirling partition numbers.) By simple algebra, it is
seen that the functional equation admits an explicit solution in terms of the Cayley tree function
itself ($T = ze^{T}$): we find

$$T(z,u) = z(u-1) + T\!\left(z e^{z(u-1)}\right).$$

As we know, the function $T(z)$ has a dominant singularity of the square-root type at $e^{-1}$, so that

$$\rho(u) = \frac{1}{1-u}\,T\!\left(e^{-1}(1-u)\right), \tag{65}$$

and we get $\rho(1) = e^{-1}$, as we should. Accordingly, the function $z \mapsto T(z,u)$ has a singularity
of the square-root type at $\rho(u)$, to which Theorem IX.12 can be applied. The expansion near
$u = 1$ then comes automatically from (65):

$$\frac{\rho(u)}{\rho(1)} = 1 - e^{-1}(u-1) + \frac{3}{2}e^{-2}(u-1)^2 + O\bigl((u-1)^3\bigr).$$

Hence the mean and the variance of the number $X_n$ of leaves in a random tree of size $n$ satisfy
$E\{X_n\} \sim e^{-1}n \approx 0.36787\,n$ and $\sigma^2\{X_n\} \sim e^{-2}(e-2)\,n \approx 0.09720\,n$, the limit law being
Gaussian. □
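The mean and variance constants quoted in Examples IX.24 and IX.25 all derive from the function $u \mapsto \rho(1)/\rho(u)$. The following sketch (an illustration with names of our choosing) uses the standard fact that $\mathfrak{m}(f)$ and $\mathfrak{v}(f)$ are the first and second derivatives of $t \mapsto \log f(e^t)$ at $t = 0$, and approximates them by central differences. For Cayley trees, $\rho(u)$ is obtained by solving $\rho\, e^{\rho(u-1)} = e^{-1}$ numerically, which is equivalent to (65).

```python
import math

def mean_var_coefficients(f, h=1e-3):
    """Approximate (m(f), v(f)): derivatives of t -> log f(e^t) at t = 0."""
    g = lambda t: math.log(f(math.exp(t)))
    first = (g(h) - g(-h)) / (2 * h)
    second = (g(h) - 2 * g(0.0) + g(-h)) / (h * h)
    return first, second

def general_catalan(u):
    """rho(1)/rho(u) for rho(u) = (1 + sqrt(u))**-2."""
    return (1 + math.sqrt(u)) ** 2 / 4

def binary_catalan(u):
    """rho(1)/rho(u) for rho(u) = 1/(2*(1 + sqrt(u)))."""
    return (1 + math.sqrt(u)) / 2

def cayley_rho(u):
    """Solve rho*exp(rho*(u - 1)) = 1/e by bisection on [0, 1] (u close to 1)."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if mid * math.exp(mid * (u - 1)) < math.exp(-1):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def cayley(u):
    """rho(1)/rho(u) for Cayley trees, with rho(1) = 1/e."""
    return math.exp(-1) / cayley_rho(u)

if __name__ == "__main__":
    print(mean_var_coefficients(general_catalan))   # near (1/2, 1/8)
    print(mean_var_coefficients(binary_catalan))    # near (1/4, 1/16)
    print(mean_var_coefficients(cayley))            # near (1/e, (e - 2)/e^2)
```

Finite differences are used only to keep the sketch dependency-free; a computer algebra system would give the same constants symbolically.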


Example IX.26. Patterns in binary Catalan trees. We present here a more sophisticated
example that generalizes the problem of counting leaves in trees. It arises from the analysis of
pattern matching and of compact representations of trees [257, 561]. The BGF of the number of
(pruned) binary trees with $z$ marking size and $u$ marking the number of occurrences of a pattern
of size $m$ is

$$F(z,u) = \frac{1}{2z}\left(1 - \sqrt{1 - 4z - 4(u-1)z^{m+1}}\right), \tag{66}$$

as seen in Note III.40 (p. 213) and Note III.41 (p. 214).

The quantity under the square-root in (66) has a unique root at $\rho = 1/4$ when $u = 1$,
while it has $m + 1$ roots for $u \ne 1$. By general properties of implicit and, specifically, algebraic
functions (Implicit Function Theorem, Weierstrass Preparation), as $u$ tends to 1, one of these
roots, call it $\rho(u)$, tends to $1/4$, while all the others $\{\rho_j(u)\}_{j=1}^{m}$ escape to infinity. We have

$$H(z,u) := \frac{1 - 4z - 4z^{m+1}(u-1)}{1 - z/\rho(u)} = \prod_{j=1}^{m}\bigl(1 - z/\rho_j(u)\bigr),$$

which is an analytic function in $(z,u)$ for $(z,u)$ in a complex neighbourhood of $(1/4, 1)$. (This
results from the fact that the algebraic function $\rho(u)$ is analytic at $u = 1$.) The singular
expansion of $G(z,u) = zF(z,u)$ is then given by

$$G(z,u) = \frac{1}{2} - \frac{1}{2}\sqrt{H(z,u)}\,\sqrt{1 - z/\rho(u)}.$$

Thus, we are under the conditions of Theorem IX.12. Accordingly, the number of occurrences
taken over a random binary tree of size $n + 1$ has mean and variance given asymptotically
by $\mathfrak{m}\bigl((4\rho(u))^{-1}\bigr)\,n$ and $\mathfrak{v}\bigl((4\rho(u))^{-1}\bigr)\,n$, respectively. The expansion of $\rho(u)$ at 1 is computed
easily by iteration ("bootstrapping") from the defining equation,

$$z = \frac{1}{4} - z^{m+1}(u-1) = \frac{1}{4} - \left(\frac{1}{4} - z^{m+1}(u-1)\right)^{m+1}(u-1) = \cdots,$$

to the effect that

$$\rho(u) = \frac{1}{4} - \frac{1}{4^{m+1}}(u-1) + \frac{m+1}{4^{2m+1}}(u-1)^2 + O\bigl((u-1)^3\bigr).$$

Proposition IX.16. The number of occurrences of a pattern of size $m$ in a random Catalan tree
of size $n + 1$ admits a Gaussian limit distribution, with mean $\mu_n$ and variance $\sigma_n^2$ that satisfy

$$\mu_n \sim \frac{n}{4^{m}}, \qquad \sigma_n^2 \sim n\left(\frac{1}{4^{m}} - \frac{2m+1}{4^{2m}}\right).$$


In particular, the probability of occurrence of a pattern at a random node of a random tree
decreases fast (the factor of $4^{-m}$ in the estimate of averages) with the size of the pattern, a
property that parallels the one already known for strings (p. 659). The paper of Steyaert and
Flajolet [561] shows that similar properties hold for any simply generated family, at least in
an expected value sense. Flajolet, Sipala, and Steyaert [257] build upon the foregoing analysis
to show that the minimal "dag representation" of a random tree (where identical subtrees are
"shared" and represented only once) is of average size $O\bigl(n(\log n)^{-1/2}\bigr)$. □
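The bootstrapping step of Example IX.26 can be reproduced with exact rational arithmetic. The sketch below is illustrative (helper names are ours): it iterates $\rho \leftarrow \frac14 - \varepsilon\,\rho^{m+1}$ on series truncated in $\varepsilon = u - 1$, then derives the mean and variance constants from $f(u) = (4\rho(u))^{-1}$ through $\mathfrak{m}(f) = f'(1)$ and $\mathfrak{v}(f) = f''(1) + f'(1) - f'(1)^2$ (valid here because $f(1) = 1$).

```python
from fractions import Fraction

def mul_trunc(a, b, order):
    """Product of two series in eps, truncated after eps**order."""
    c = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                c[i + j] += ai * bj
    return c

def rho_series(m, order=2):
    """Coefficients of rho(1 + eps), from the fixed point rho = 1/4 - eps*rho**(m+1)."""
    rho = [Fraction(1, 4)] + [Fraction(0)] * order
    for _ in range(order):                  # each pass fixes one more coefficient
        power = [Fraction(1)] + [Fraction(0)] * order
        for _ in range(m + 1):
            power = mul_trunc(power, rho, order)
        rho = [Fraction(1, 4)] + [-power[k - 1] for k in range(1, order + 1)]
    return rho

def pattern_constants(m):
    """Return (mean constant, variance constant) for patterns of size m."""
    _, r1, r2 = rho_series(m, 2)
    q1, q2 = 4 * r1, 4 * r2                 # 4*rho = 1 + q1*eps + q2*eps**2 + ...
    f1 = -q1                                # f = 1/(4*rho) = 1 + f1*eps + f2*eps**2
    f2 = q1 * q1 - q2
    return f1, 2 * f2 + f1 - f1 * f1        # f''(1) = 2*f2

if __name__ == "__main__":
    for m in (1, 2, 3):
        print(m, rho_series(m), pattern_constants(m))
```

For $m = 1$ the constants reduce to $1/4$ and $1/16$, in agreement with the values found for leaves in binary Catalan trees.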


▷ IX.41. Leaves in Motzkin trees. The number of leaves in a unary-binary (Motzkin) tree is
asymptotically Gaussian. ◁


▷ IX.42. Patterns in classical varieties of trees. Patterns in general Catalan trees and Cayley
trees can be similarly analysed. ◁


IX.7.3. Algebraic and implicit functions. Under the univariate counting scenario,
we have encountered in Chapter VII many analytic-combinatorial conditions
leading to singular exponents that are non-integral. For instance, many implicitly
defined functions, including important algebraic cases, have a dominant singularity
that is of the square-root type (the exponent is $\alpha = -1/2$ in the notations of
Theorem IX.12). If a corresponding specification is enriched by markers, there is a fair
chance that the square-root singularity property will persist (as in Figure IX.13, p. 679)
when the marking variable $u$ remains close to 1, so that, by Theorem IX.12, a Gaussian
law results in the scale of $n$. Similar comments apply to functions defined implicitly by
systems of equations, including algebraic functions, provided suitable non-degeneracy
conditions¹⁶ are satisfied. Here, we only state a single proposition, which is meant to
illustrate in a simple situation the type of treatment to which implicitly defined BGFs
can be subjected.


¹⁶ Subsection IX.11.2 (p. 707) below examines cases where a confluence of singularities induces a
stable law instead of the customary Gaussian distribution.
Proposition IX.17 (Perturbation of algebraic functions). Let $F(z,u)$ be a bivariate
function that is analytic at $(0,0)$ and has non-negative coefficients. Assume that
$F(z,u)$ is one of the solutions $y$ of a polynomial equation

$$y - \Phi(z,u,y) = 0,$$

where $\Phi$ is a polynomial of degree $d \ge 2$ in $y$, such that $\Phi(z,1,y)$ satisfies the
conditions of the smooth implicit function schema of Section VII.4, p. 467, with
$G(z,w) := \Phi(z,1,w)$. Let $\rho, \tau$ be the solutions of the characteristic system (relative
to $u = 1$), so that $y(z) := F(z,1)$ is singular at $z = \rho$ and $y(\rho) = \tau$. Define the
resultant polynomial (Appendix B.1: Algebraic elimination, p. 739),

$$\Delta(z,u) = \mathbf{R}\left(y - \Phi(z,u,y),\; 1 - \frac{\partial}{\partial y}\Phi(z,u,y),\; y\right),$$

so that $\rho$ is a simple root of $\Delta(z,1)$. Let $\rho(u)$ be the unique root of the equation

$$\Delta(\rho(u), u) = 0,$$

analytic at 1, such that $\rho(1) = \rho$. Then, provided the variability condition

$$\mathfrak{v}\left(\frac{\rho(1)}{\rho(u)}\right) > 0$$

is satisfied, a Gaussian Limit Law holds for the coefficients of $F(z,u)$.


Proof. By the developments of Theorem VII.3, p. 468, the function $y(z) = F(z,1)$
has a square-root singularity at $z = \rho$. The polynomial $y - \Phi(\rho, 1, y)$ has a double
(not triple) zero at $y = \tau$, so that

$$\left.\frac{\partial}{\partial y}\bigl(y - \Phi(\rho,1,y)\bigr)\right|_{y=\tau} = 0, \qquad \left.\frac{\partial^2}{\partial y^2}\bigl(y - \Phi(\rho,1,y)\bigr)\right|_{y=\tau} \ne 0.$$

Thus, the Weierstrass Preparation Theorem gives the local factorization

$$y - \Phi(z,u,y) = \bigl(y^2 + c_1(z,u)\,y + c_2(z,u)\bigr)\,H(z,u,y),$$

where $H(z,u,y)$ is analytic and non-zero at $(\rho, 1, \tau)$ while $c_1(z,u)$, $c_2(z,u)$ are
analytic at $(z,u) = (\rho, 1)$.
From the solution of the quadratic equation, we must have locally

$$y = \frac{1}{2}\left(-c_1(z,u) \pm \sqrt{c_1(z,u)^2 - 4c_2(z,u)}\right).$$

Consider first $(z,u)$ restricted by $0 \le z < \rho$ and $0 \le u < 1$. Since $F(z,u)$ is real
there, we must have $c_1(z,u)^2 - 4c_2(z,u)$ also real and non-negative. Since $F(z,u)$ is
continuous and increasing with $z$ for fixed $u$, and since the discriminant $c_1(z,u)^2 - 4c_2(z,u)$
vanishes at 0, the determination with the minus sign has to be constantly
taken. In summary, we have

$$F(z,u) = \frac{1}{2}\left(-c_1(z,u) - \sqrt{c_1(z,u)^2 - 4c_2(z,u)}\right). \tag{67}$$

Set $D(z,u) := c_1(z,u)^2 - 4c_2(z,u)$. The function $D(z,1)$ has a simple real zero
at $z = \rho$. Thus, by the Analytic Inverse Function Theorem (or Weierstrass preparation
again), there is locally a unique analytic branch of the solution to $D(\rho(u), u) = 0$ such
that $\rho(1) = \rho$, and $D(z,u)$ factorizes as

$$D(z,u) = (\rho(u) - z)\,K(z,u),$$

for some analytic $K$ satisfying $K(\rho, 1) \ne 0$. The conditions of Theorem IX.12 therefore
hold. The stated Gaussian law follows. ∎
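As an illustration of how the movable singularity $\rho(u)$ of Proposition IX.17 can be located in practice, the sketch below solves the pair of equations $y = \Phi(z,u,y)$, $1 = \partial_y \Phi(z,u,y)$ by Newton iteration. The polynomial $\Phi = zu + (1-u)zy + y^2$ is our own rewriting of the quadratic equation for leaves in general Catalan trees (Example IX.24), chosen only because the answer $\rho(u) = (1+\sqrt{u})^{-2}$ is known in closed form; the function name and starting point are assumptions of the sketch.

```python
import math

def characteristic_point(u, z=0.25, y=0.5, steps=50):
    """Newton iteration for  y = Phi(z,u,y),  1 = dPhi/dy(z,u,y),
    with the illustrative choice Phi = z*u + (1 - u)*z*y + y**2.
    The starting point (1/4, 1/2) is the solution at u = 1."""
    for _ in range(steps):
        e1 = y - (z * u + (1 - u) * z * y + y * y)
        e2 = 1 - ((1 - u) * z + 2 * y)
        # Jacobian of (e1, e2) with respect to (z, y)
        a, b = -(u + (1 - u) * y), 1 - (1 - u) * z - 2 * y
        c, d = -(1 - u), -2.0
        det = a * d - b * c
        dz = (e1 * d - b * e2) / det       # Cramer's rule for the Newton step
        dy = (a * e2 - c * e1) / det
        z, y = z - dz, y - dy
    return z, y

if __name__ == "__main__":
    for u in (0.8, 1.0, 1.2):
        rho, tau = characteristic_point(u)
        print(u, rho, (1 + math.sqrt(u)) ** -2, tau)
```

Eliminating $y$ by hand from the same two equations gives $(1 + (u-1)z)^2 - 4zu = 0$, which is the equation $C(z,u) = 0$ of Example IX.24; the numerical solution should therefore track the smaller root of $C$.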


The last proposition asserts that, under certain conditions, the only possible dominant
singularity of the function $z \mapsto F(z,u)$ is a smooth lifting of the singularity
of the univariate GF $F(z,1)$, while the nature of the singularity does not change: it
remains of the square-root type. Similar results, established by similar methods, hold
true for more general equations and systems, under suitable non-degeneracy and variability
conditions. Indeed, one can go all the way from algebraic functions defined by
a single polynomial equation, as above, to functions implicitly defined by systems of
analytic equations. This has been done by Drmota in an important paper [172]. For a
system $\mathbf{y} = \Phi(z, u, \mathbf{y})$, the approach consists of looking at the Jacobian of the
transformation, as in Subsection VII.6.1 (p. 482), and imposing conditions that allow for a
smooth singularity displacement. The Weierstrass Preparation Theorem normally provides
the needed permanence of analytic relations that imply a persistent square-root
singularity.

The scope of Theorem IX.12, Proposition IX.17, and their derivative products
is enormous: potentially, all the recursive combinatorial structures examined in
Sections VII.3 to VII.8 (pp. 452-518) are concerned. This includes trees of various sorts,
mappings, lattice paths and their generalizations, planar maps, as well as languages
and classes described by context-free specifications, to name a few.


Example IX.27. A pot-pourri of Gaussian laws. In the list that follows, all the mentioned
parameters obey a Gaussian limit distribution in the scale of $n$. The proofs (omitted) involve in
each case a precise investigation of the perturbation of univariate singular expansions induced
by the secondary parameter, in a way similar to Theorem IX.12.


Simple varieties of trees, p. 452. The number of leaves is Gaussian (see Examples IX.24
and IX.25 above) and the property extends to the number of nodes of any fixed degree $r$ as well
as to the number of occurrences of any fixed pattern (see Example IX.26). This property also
holds true for simple varieties of trees introduced in Section VII.3, and it extends to unlabelled
non-plane trees [121].

Mappings, p. 462. The number of points with $r$ predecessors is Gaussian, as is the
cardinality of the image set, the property being also true for mappings defined by degree
restrictions [18, 247].

Irreducible context-free structures, p. 482. Examples given in the paper of Drmota [172]
are the number of independent sets in a random tree and the number of patterns in a context-free
language.

Non-crossing graphs, p. 485. The number of connected components and the number of
edges in either forests or general non-crossing graphs is Gaussian [245]. (These properties are
thus in sharp contrast with those of the usual random graph model of Erdős and Rényi [76].)

Walks in the discrete plane, p. 506. The number of steps of any fixed kind is Gaussian
for walks, excursions, bridges and meanders. An extension of the known methods shows that
the number of occurrences of any fixed pattern (made of contiguous letters) is also asymptotically
normal. For instance, the number of occurrences of the pattern up-down-up-up-down in a
random Dyck word (excursion) satisfies this property.

Planar maps, p. 513. The number of occurrences of any fixed submap is asymptotically
Gaussian (see [278] for a proof based on moment methods). Thus, maps are like words and
trees: any fixed collection of patterns occurs in a large enough random object with high
probability (Borges’ Theorem, p. 61). □


IX. 7.4. Differential equations. We have encountered in this book sporadic 
combinatorial classes whose GFs are determined as solutions of ordinary differen- 
tial equations (ODEs), and we have presented in Section VII. 9 (p. 518) several such 
structures that are amenable to singularity analysis. Basic parameters are then likely 
still to lead to ODEs, but ones that are now parameterized by the secondary vari- 
able u. (By contrast, partial differential equations have so far been only scarcely used 
in analytic combinatorics.) In such cases, a singularity perturbation analysis is often 
feasible. Both situations, that of a variable exponent and that of a movable singularity, 
can occur, as we now illustrate, largely by means of examples. The partial treatment 
given here should at least convey the spirit of the singularity perturbation process, in 
the context of differential equations. 


Linear differential equations. ODEs in one variable, when linear and when having
analytic coefficients, admit solutions whose singularities occur at well-defined
places, namely those that entail a reduction of order (see Subsection VII.9.1, p. 518,
and Section VIII.7, p. 581, for the so-called "regular" and "irregular" cases, respectively).
The possible singular exponents of solutions are then obtained as roots of a
polynomial equation, the indicial equation. Such ordinary differential equations are
usually a reflection of a combinatorial decomposition of sorts, so that suitably
parameterized versions open access to a number of combinatorial parameters. In the cases
considered here, the ODE satisfied by a BGF $F(z,u)$ remains an ODE in the main
variable $z$ that records size, while the auxiliary variable $u$ only affects coefficients.
We start with a simple example, Example IX.28, relative to node levels in increasing
binary trees, continue with a general statement, Proposition IX.18, relative to the
case of a variable exponent in a linear ODE, and conclude with an application to node
levels in quadtrees in Example IX.29.


Example IX.28. Node levels in increasing binary trees. Increasing binary trees are labelled
(pruned) binary trees, such that any branch from the root has monotonically increasing labels.
As explained in Example II.17 (p. 143), these trees are an important representation of
permutations. Their specification, in terms of the boxed product of Chapter II, is

$$\mathcal{F} = 1 + \bigl(\mathcal{Z}^{\Box} \star \mathcal{F} \star \mathcal{F}\bigr) \quad\Longrightarrow\quad F(z) = 1 + \int_0^z F(t)^2\,dt, \tag{68}$$

and, accordingly, their EGF is

$$F(z) = \frac{1}{1-z} = \sum_{n \ge 0} n!\,\frac{z^n}{n!}.$$

Let $F(z,u)$ be the BGF of trees where $u$ records the depth of external nodes. In other
words, $f_{n,k} = [z^n u^k]F(z,u)$ is such that $\frac{1}{n+1} f_{n,k}$ represents the probability that a random
external node in a random tree of size $n$ is at depth $k$. (The probability space is then a product
set of cardinality $(n+1) \cdot n!$, as there are $n!$ trees each containing $(n+1)$ external nodes. By
a standard equivalence principle, the quantity $\frac{1}{n+1} f_{n,k}$ also gives the probability that a random
unsuccessful search in a random binary search tree of size $n$ necessitates $k$ comparisons.)
Since the depth of a node is inherited from subtrees, the function $F(z,u)$ satisfies the
linear integral equation derived from (68) (see also Equation (VI.67), p. 429 in relation to the
BST recurrence),

$$F(z,u) = 1 + 2u \int_0^z F(t,u)\,\frac{dt}{1-t}, \tag{69}$$

or, after differentiation,

$$\frac{\partial}{\partial z}F(z,u) = \frac{2u}{1-z}\,F(z,u), \qquad F(0,u) = 1.$$

This equation is nothing but a linear ODE, with $u$ entering as a parameter in the coefficients,

$$\frac{d}{dz}y(z) - \frac{2u}{1-z}\,y(z) = 0,$$

the solution of any such separable first-order ODE being obtained by quadratures:

$$F(z,u) = \frac{1}{(1-z)^{2u}}.$$

From singularity analysis, provided $u$ avoids $\{0, -1/2, -1, \ldots\}$, we have

$$f_n(u) \equiv [z^n]F(z,u) = \frac{n^{2u-1}}{\Gamma(2u)}\left(1 + O\left(\frac{1}{n}\right)\right),$$

and a uniform approximation holds, provided (say) $|u - 1| < 1/4$. Thus, Theorem IX.11 applies,
to the effect that the distribution of the depth of a random external node in a random increasing
binary tree, with PGF $f_n(u)/f_n(1)$, admits a Gaussian limit law.
Naturally, explicit expressions are available in such a simple case,

$$\frac{f_n(u)}{f_n(1)} = \frac{2u \cdot (2u+1) \cdots (2u+n-1)}{(n+1)!},$$

so a direct proof of the Gaussian limit in the line of Goncharov’s theorem (p. 645) is clearly
possible; see Mahmoud’s book [429, Ch. 2], for this result originally due to Louchard. What is
interesting here is the fact that $F(z,u)$ viewed as a function of $z$ has a singularity at $z = 1$ that
does not move and, in a way, originates in the combinatorics of the problem, through the EGF
of permutations, $(1-z)^{-1}$. The auxiliary parameter $u$ appears here directly in the exponent, so
that the application of singularity analysis or of the more sophisticated Theorem IX.11 (p. 669)
is immediate.
A similar Gaussian law holds for levels of internal nodes, and is proved by similar devices.
The Gaussian profile is even perceptible on a single instance. In particular, Figure III.18 (p. 203)
suggests a much stronger "functional limit theorem" for these objects (namely, almost all trees
have an approximate Gaussian profile): this property, which seems currently beyond the scope
of analytic combinatorics, has been proved by Chauvin and Jabbour [114] using martingale
theory. □
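The integral equation (69) translates into the coefficient recurrence $n f_n(u) = 2u \sum_{j<n} f_j(u)$, which can be run exactly. The sketch below (an illustration with names of our choosing) computes the polynomials $f_n(u)$ from that recurrence, checks them against the product form $[z^n](1-z)^{-2u} = 2u(2u+1)\cdots(2u+n-1)/n!$, and compares the mean depth with $2(H_{n+1} - 1)$, the value obtained by logarithmic differentiation of the product.

```python
from fractions import Fraction

def level_polynomials(n_max):
    """f[n][k] = [z^n u^k] F(z,u), from n*f_n(u) = 2u * (f_0 + ... + f_{n-1})."""
    f = [[Fraction(1)]]                      # f_0(u) = 1
    running = [Fraction(1)]                  # partial sum f_0 + ... + f_{n-1}
    for n in range(1, n_max + 1):
        fn = [Fraction(0)] + [2 * c / n for c in running]    # multiply by 2u/n
        f.append(fn)
        running = [a + b for a, b in zip(running + [Fraction(0)], fn)]
    return f

def product_formula(n):
    """Coefficients in u of 2u(2u+1)...(2u+n-1)/n!."""
    poly = [Fraction(1)]
    for j in range(n):
        nxt = [Fraction(0)] * (len(poly) + 1)
        for k, c in enumerate(poly):         # multiply by (2u + j)/(j + 1)
            nxt[k] += c * j / (j + 1)
            nxt[k + 1] += 2 * c / (j + 1)
        poly = nxt
    return poly

def mean_depth(poly):
    """Mean of the distribution with weights poly[k] (normalized by f_n(1))."""
    return sum(k * c for k, c in enumerate(poly)) / sum(poly)

if __name__ == "__main__":
    f = level_polynomials(200)
    harmonic = sum(Fraction(1, j) for j in range(1, 202))    # H_201
    print(float(mean_depth(f[200])), float(2 * (harmonic - 1)))
```

The mean grows like $2\log n$, which is the logarithmic scale expected for trees arising from permutations.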


Proposition IX.18 (Linear differential equations). Let $F(z,u)$ be a bivariate generating
function with non-negative coefficients that satisfies a linear differential equation

$$a_0(z,u)\,\frac{\partial^r F}{\partial z^r} + \frac{a_1(z,u)}{(\rho - z)}\,\frac{\partial^{r-1} F}{\partial z^{r-1}} + \cdots + \frac{a_r(z,u)}{(\rho - z)^r}\,F = 0,$$

with $a_j(z,u)$ analytic at $\rho$, and $a_0(\rho, 1) \ne 0$. Let $f_n(u) = [z^n]F(z,u)$, and assume
the following conditions:


• [Non-confluence] The indicial polynomial

$$J(\alpha) = a_0(\rho,1)\,(\alpha)_{(r)} + a_1(\rho,1)\,(\alpha)_{(r-1)} + \cdots + a_r(\rho,1) \tag{70}$$

has a unique root $\sigma > 0$ which is simple and such that all other roots $\alpha \ne \sigma$
satisfy $\Re(\alpha) < \sigma$;

• [Dominant growth] $f_n(1) \sim C \cdot \rho^{-n} n^{\sigma - 1}$, for some $C > 0$.

• [Variability condition]

$$\sup \frac{\mathfrak{v}(f_n(u))}{\log n} > 0.$$

Then the coefficients of $F(z,u)$ admit a limit Gaussian law.


Proof. (See the paper by Flajolet and Lafforgue [243] for a detailed analysis and the
books by Henrici [329] and Wasow [602] for a general treatment of singularities of linear
ODEs.) We assume in this proof that no two roots of the indicial polynomial (70)
differ by an integer. Consider first the univariate problem, for which we summarize
the discussion started on p. 518. A differential equation,

$$a_0(z)\,\frac{d^r F}{dz^r} + \frac{a_1(z)}{(\rho - z)}\,\frac{d^{r-1} F}{dz^{r-1}} + \cdots + \frac{a_r(z)}{(\rho - z)^r}\,F = 0, \tag{71}$$

with the $a_j(z)$ analytic at $\rho$ and $a_0(\rho) \ne 0$ has a basis of local singular solutions
obtained by substituting $(\rho - z)^{-\alpha}$ and cancelling the terms of maximum order of
growth. The candidate exponents are thus roots of the indicial equation,

$$J(\alpha) \equiv a_0(\rho)\,(\alpha)_{(r)} + a_1(\rho)\,(\alpha)_{(r-1)} + \cdots + a_r(\rho) = 0.$$

If there is a unique (simple) root of maximum real part, $\alpha_1$, then there exists a solution
to (71) of the form

$$Y_1(z) = (\rho - z)^{-\alpha_1}\,h_1(\rho - z),$$

where $h_1(w)$ is analytic at 0 and $h_1(0) = 1$. (This results easily from a solution by
indeterminate coefficients.) All other solutions are then of smaller growth and of the
form

$$Y_j(z) = (\rho - z)^{-\alpha_j}\,h_j(\rho - z)\,\bigl(\log(z - \rho)\bigr)^{k_j},$$

for some integers $k_j$ and some functions $h_j(w)$ analytic at $w = 0$. Then, $F(z)$ has the
form

$$F(z) = \sum_{j=1}^{r} c_j\,Y_j(z).$$

Then, provided $c_1 \ne 0$,

$$[z^n]F(z) = \frac{c_1}{\Gamma(\alpha_1)}\,\rho^{-n-\alpha_1}\,n^{\alpha_1 - 1}\,(1 + o(1)).$$

Under the assumptions of the theorem, we must have $\sigma = \alpha_1$, and $c_1 \ne 0$. (The reality
assumption on $\sigma$ is natural for a series $F(z)$ that has real coefficients.)

When $u$ varies in a neighbourhood of 1, we have a uniform expansion

$$F(z,u) = c_1(u)\,(\rho - z)^{-\sigma(u)}\,H_1(\rho - z, u)\,(1 + o(1)), \tag{72}$$

for some bivariate analytic function $H_1(w,u)$ with $H_1(0,u) = 1$, where $\sigma(u)$ is the
algebraic branch that is a root of

$$J(\alpha, u) \equiv a_0(\rho,u)\,(\alpha)_{(r)} + a_1(\rho,u)\,(\alpha)_{(r-1)} + \cdots + a_r(\rho,u) = 0,$$

and coincides with $\sigma$ at $u = 1$. By singularity analysis, this entails

$$[z^n]F(z,u) = \frac{c_1(u)}{\Gamma(\sigma(u))}\,\rho^{-n-\sigma(u)}\,n^{\sigma(u) - 1}\,(1 + o(1)), \tag{73}$$

uniformly for $u$ in a small neighbourhood of 1, with the error term being $O(n^{-a})$ for
some $a > 0$. Thus Theorem IX.11 (p. 669) applies and the limit law is Gaussian.

The crucial point in (72) and (73) is the uniform character of expansions with
respect to $u$. This results from two facts: (i) the solution to (71) may be specified
by analytic conditions at a point $z_0$ such that $z_0 < \rho$ and there are no singularities
of the equation between $z_0$ and $\rho$; and (ii) there is a suitable set of solutions with an
analytic component in $z$ and $u$ and singular parts of the form $(\rho - z)^{-\alpha_j(u)}$, as results
from the matrix theory of differential systems and majorant series. (This last point is
easily verified if no two roots of the indicial equation differ by an integer; otherwise,
see [243] for an alternative basis of solutions for $u$ near 1, $u \ne 1$.) ∎


Example IX.29. Node levels in quadtrees. Quadtrees defined in Example VII.23 (p. 522)
are one of the most versatile data structures known for managing collections of points in
multidimensional space. They are based on a recursive decomposition similar to that of binary search
trees and increasing binary trees of the previous example.

This example is borrowed from [243]. We fix the dimension $d \ge 2$ of the ambient data
space. Let $f_{n,k}$ be the number of external nodes at level $k$ in a quadtree of size $n$ grown by
random insertions, and let $F(z,u)$ be the corresponding BGF. Two integral operators play an
essential rôle,

$$\mathbf{I}\,g(z) := \int_0^z g(t)\,\frac{dt}{1-t}, \qquad \mathbf{J}\,g(z) := \int_0^z g(t)\,\frac{dt}{t(1-t)}.$$

The basic equation that reflects the recursive splitting process of quadtrees is then (see [243]
and Chapter VII, p. 522 for similar techniques)

$$F(z,u) = 1 + 2^d u\,\mathbf{J}^{d-1}\,\mathbf{I}\,F(z,u). \tag{74}$$

The integral equation (74) satisfied by $F$ then transforms into a differential equation of order $d$,

$$\mathbf{I}^{-1}\mathbf{J}^{1-d}\,F(z,u) = 2^d u\,F(z,u),$$

where

$$\mathbf{I}^{-1} g(z) = (1-z)\,g'(z), \qquad \mathbf{J}^{-1} g(z) = z(1-z)\,g'(z).$$

The linear ODE version of (74) has an indicial polynomial that is easily determined by
examination of the reduced form of the ODE (74) at $z = 1$. There, one has

$$\mathbf{J}^{-1} g(z) = \mathbf{I}^{-1} g(z) - (z-1)^2 g'(z) \approx (1-z)\,g'(z).$$

Thus,

$$\mathbf{I}^{-1}\mathbf{J}^{1-d}\,(1-z)^{-\theta} = \theta^d\,(1-z)^{-\theta} + O\bigl((1-z)^{-\theta+1}\bigr),$$

and the indicial polynomial is

$$J(\alpha, u) = \alpha^d - 2^d u.$$

In the univariate case, the root of largest real part is $\alpha_1 = 2$; in the bivariate case, we have

$$\alpha_1(u) = 2u^{1/d},$$

where the principal branch is chosen. Thus,

$$f_n(u) = \gamma(u)\,n^{\alpha_1(u) - 1}\,(1 + o(1)).$$

By the combinatorial origin of the problem, $F(z,1) = (1-z)^{-2}$, so that the coefficient $\gamma(1)$
is non-zero. Thus, the conditions of Proposition IX.18 are satisfied: the depth of a random
external node in a randomly grown quadtree is Gaussian in the limit, with mean and variance

$$\mu_n \sim \frac{2}{d}\log n, \qquad \sigma_n^2 \sim \frac{2}{d^2}\log n.$$

The same result applies to the cost of a (fully specified) random search, either successful or not,
as shown in [243] by an easy combinatorial argument. □
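In the variable-exponent situation, $f_n(u)/f_n(1)$ behaves like $\exp\bigl((\sigma(u) - \sigma(1))\log n\bigr)$ up to a bounded factor, so the mean and variance are $\log n$ times the first and second derivatives of $t \mapsto \sigma(e^t)$ at $t = 0$. The sketch below (illustrative, with names of our choosing and finite differences in place of symbolic differentiation) recovers the constants $2/d$ and $2/d^2$ for quadtrees from $\sigma(u) = 2u^{1/d}$, and the constants $2$ and $2$ for increasing binary trees from $\sigma(u) = 2u$.

```python
import math

def log_scale_constants(sigma, h=1e-3):
    """For f_n(u) ~ c(u) * n**(sigma(u) - 1), return (a, b) with
    mean ~ a*log(n) and variance ~ b*log(n): the first two derivatives
    of t -> sigma(e^t) at t = 0, approximated by central differences."""
    s = lambda t: sigma(math.exp(t))
    a = (s(h) - s(-h)) / (2 * h)
    b = (s(h) - 2 * s(0.0) + s(-h)) / (h * h)
    return a, b

def quadtree_exponent(d):
    """Principal root alpha_1(u) = 2*u**(1/d) of alpha**d - 2**d * u = 0."""
    return lambda u: 2 * u ** (1.0 / d)

if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, log_scale_constants(quadtree_exponent(d)))   # near (2/d, 2/d^2)
    print(log_scale_constants(lambda u: 2 * u))               # near (2, 2)
```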


From the global point of view of analytic combinatorics, it is of interest to place 
the last two examples in perspective. Simple varieties of trees, as considered in ear- 
lier subsections, are “square-root trees”, where height and depth of a random node are 
each of order $\sqrt{n}$ (on average, in distribution), while the corresponding univariate GFs
satisfy algebraic or implicit equations and have a square-root singularity. Trees that 
in some way arise from permutations (increasing trees, binary search trees, quadtrees) 
are “logarithmic trees”: they are specified by order-constrained constructions that cor- 
respond to integro-differential operators, and their depth appears to be logarithmic 
with Gaussian fluctuations, as a reflection of a perturbative singularity analysis of 
ODEs. 


Nonlinear differential equations. Although nonlinear differential equations defy 
classification in all generality, there are a number of examples in analytic combina- 
torics that can be treated by singularity perturbation methods. We detail here the 
typical analysis of “paging” in binary search trees (BSTs), or equivalently increasing 
binary trees, taken from [235]. The Riccati equation involved reduces, by classical 
techniques, to a linear second-order equation whose perturbation analysis is particu- 
larly transparent and akin to earlier analyses of ODEs. In this problem, the auxiliary 
parameter induces a movable singularity that leads to a Gaussian limit law in the scale 
of n. 


Example IX.30. Paging of binary search trees and increasing binary trees. Fix a "page size"
parameter $b \ge 2$. Given a tree $t$, its $b$-index is a tree constructed by retaining only those internal
nodes of $t$ which correspond to subtrees of size $> b$. As a computer data structure, such an index
is well-suited to "paging", where one has a two-level hierarchical memory structure: the index
resides in main memory and the rest of the tree is kept in pages of capacity $b$ on peripheral
storage, see for instance [429]. We let $\iota[t] = \iota_b[t]$ denote the size (number of nodes) of the
$b$-index of $t$.

We consider here the analysis of paging in binary search trees, whose model is known to
be equivalent to that of increasing binary trees. The bivariate generating function

$$F(z,u) = \sum_{t} \lambda(t)\,u^{\iota[t]}\,z^{|t|}$$

satisfies a Riccati equation that reflects the root decomposition of trees (see (68)),

$$\frac{\partial}{\partial z}F(z,u) = u\,F(z,u)^2 + (1-u)\,\frac{d}{dz}\left(\frac{1 - z^{b+1}}{1-z}\right), \qquad F(0,u) = 1, \tag{75}$$

where the quadratic relation has to be adjusted in its low-order terms.
The GFs of moments are rational functions with a denominator that is a power of $(1-z)$,
as results from differentiation at $u = 1$. Mean and variance follow:

$$\mu_n = \frac{2(n+1)}{b+2} - 1, \qquad \sigma_n^2 = \frac{2}{3}\,\frac{(b-1)\,b\,(b+1)}{(b+2)^2}\,(n+1).$$

(The result for the mean is well-known, refer to quantity $A_n$ in the analysis of quicksort on
p. 122 of [378].)
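The formula for the mean can be checked directly on the binary search tree model. In the sketch below (an illustration; the function name is ours), the root of a random BST on $n$ keys splits the remaining keys into subtrees of sizes $k$ and $n-1-k$ with $k$ uniform on $\{0, \ldots, n-1\}$, and the root belongs to the $b$-index exactly when $n > b$. This gives $E_n = 1 + \frac{2}{n}\sum_{k<n} E_k$ for $n > b$ and $E_n = 0$ otherwise.

```python
from fractions import Fraction

def expected_index_size(n_max, b):
    """E[n] = expected size of the b-index of a random BST with n keys."""
    E = [Fraction(0)] * (n_max + 1)
    prefix = Fraction(0)                    # E[0] + ... + E[n-1]
    for n in range(1, n_max + 1):
        if n > b:
            # the root is retained; both subtrees are random BSTs again
            E[n] = 1 + 2 * prefix / n
        prefix += E[n]
    return E

if __name__ == "__main__":
    b = 3
    E = expected_index_size(20, b)
    for n in (4, 5, 10, 20):
        print(n, E[n], Fraction(2 * (n + 1), b + 2) - 1)
```

Only the mean is checked here; the variance would require propagating the full distribution through the Riccati recurrence.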

Multiplying both sides of (75) by $u$ now gives an equation satisfied by $H(z,u) := uF(z,u)$,

$$\frac{\partial}{\partial z}H(z,u) = H(z,u)^2 + u(1-u)\,\frac{d}{dz}\left(\frac{1 - z^{b+1}}{1-z}\right),$$

that may as well be taken as a starting point since $H(z,u)$ is the bivariate GF of parameter $1 + \iota$
(a quantity also equal to the number of external pages). The classical linearization
transformation of Riccati equations,

$$H(z,u) = -\frac{X'_z(z,u)}{X(z,u)},$$

yields

$$\frac{\partial^2}{\partial z^2}X(z,u) + u(1-u)\,A(z)\,X(z,u) = 0, \qquad A(z) = \frac{d}{dz}\left(\frac{1 - z^{b+1}}{1-z}\right), \tag{76}$$

with $X(0,u) = 1$, $X'_z(0,u) = -u$. By the classical existence theorem of Cauchy, the solution
of (76) is an entire function of $z$ for each fixed $u$, since the linear differential equation has
no singularity at a finite distance. Furthermore, the dependency of $X$ on $u$ is also everywhere
analytic; see the remarks of [602, §24], for which a proof derives by inspection of the classical
existence property, based on indeterminate coefficients and majorant series. Thus, $X(z,u)$ is
actually an entire function of both complex variables $z$ and $u$. As a consequence, for any fixed
$u$, the function $z \mapsto H(z,u)$ is a meromorphic function whose coefficients are amenable to
singularity analysis.

In order to proceed further, we need to prove that, in a sufficiently small neighbourhood of
$u = 1$, $X(z,u)$ has only one simple root, corresponding for $H(z,u)$ to a unique dominant and
simple pole. This fact derives from the usual considerations surrounding the analytic Implicit
Function Theorem and the Weierstrass Preparation Theorem (Appendix B.5: Implicit Function
Theorem, p. 753). Here, we have $X(z,1) = 1 - z$. Thus, as $u$ tends to 1, all solutions in $z$
of $X(z,u) = 0$ must escape to infinity, except for one (analytic) branch $\rho(u)$ that satisfies
$\rho(1) = 1$.

The argument is now complete: the BGF $F(z,u)$ and its companion $H(z,u) = uF(z,u)$
have a movable singularity at $\rho(u)$, which is a pole. Theorem IX.9 (p. 656) relative to the
meromorphic case applies, and a Gaussian limit law results. □
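The movable pole $\rho(u)$ can be located numerically from the linearized equation. The sketch below is an illustration under stated assumptions: it takes the linear equation in the form $X'' + u(1-u)A(z)X = 0$ with $X(0,u) = 1$, $X'_z(0,u) = -u$ (the form obtained by substituting $H = -X'_z/X$ into the Riccati equation for $H$), expands $X$ as a truncated Taylor series in $z$, and finds the root near $z = 1$ by Newton iteration. The slope $-\rho'(1)/\rho(1)$ is then compared with the coefficient $2/(b+2)$ of $n$ in the mean.

```python
def x_series(u, b, terms=80):
    """Taylor coefficients in z of X(z,u), assuming X'' + u(1-u) A(z) X = 0,
    X(0) = 1, X'(0) = -u, with A(z) = 1 + 2z + ... + b*z**(b-1)."""
    x = [1.0, -u] + [0.0] * (terms - 2)
    for n in range(terms - 2):
        # coefficient of z^n in A(z) * X(z)
        s = sum((k + 1) * x[n - k] for k in range(min(n, b - 1) + 1))
        x[n + 2] = -u * (1 - u) * s / ((n + 2) * (n + 1))
    return x

def dominant_root(u, b):
    """Newton iteration for the root of X(z,u) = 0 close to z = 1 (u near 1)."""
    x = x_series(u, b)
    z = 1.0
    for _ in range(50):
        val = sum(c * z ** n for n, c in enumerate(x))
        der = sum(n * c * z ** (n - 1) for n, c in enumerate(x) if n > 0)
        z -= val / der
    return z

def mean_slope(b, h=1e-4):
    """Approximate -rho'(1)/rho(1) by central differences, using rho(1) = 1."""
    return -(dominant_root(1 + h, b) - dominant_root(1 - h, b)) / (2 * h)

if __name__ == "__main__":
    for b in (2, 3, 5, 10):
        print(b, mean_slope(b), 2 / (b + 2))
```

The truncation at 80 terms is a choice of convenience; since $X$ is entire and $u(1-u)$ is small near $u = 1$, the neglected coefficients are negligible at $z \approx 1$.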


As shown in [235], a similar analysis applies to patterns in binary search trees. 
The corresponding properties are (somewhat) related to the analysis of local or- 
der patterns in permutations, for which Gaussian limit laws have been obtained by 
Devroye [159] using extensions of the central limit theorem to weakly dependent ran- 
dom variables. 


▷ IX.43. Leaves in varieties of increasing trees. Similar displacements of singularity arise for
the number of nodes of a given type in varieties of increasing trees (Example VII.24, p. 526).
For instance, if $\phi(w)$ is the degree generator of a family of increasing trees, the nonlinear ODE
satisfied by the BGF of the number of leaves is

$$\frac{\partial}{\partial z}F(z,u) = (u-1)\,\phi(0) + \phi\bigl(F(z,u)\bigr).$$

Whenever $\phi$ is a polynomial, there is a spontaneous singularity at some $\rho(u)$ that depends
analytically on $u$. Thus, the number of leaves is asymptotically Gaussian [49]. A similar result
holds for nodes of any fixed degree $r$. ◁


IX. 8. Perturbation of saddle-point asymptotics 


The saddle-point method, which forms the subject of Chapter VIII, is also
amenable to perturbation. For instance, we already know that the number of partitions
of a domain of cardinality $n$ into classes (set partitions enumerated by the $n$th
Bell number) can be estimated by this method; a suitable perturbative analysis can
then be developed, to the effect that the number of classes in a random set partition
of large size is asymptotically Gaussian. Given the nature of univariate saddle-point
expansions and their diversity (they do not reduce to the $\rho^{-n} n^{\alpha}$ paradigm), the
Quasi-powers Theorem ceases to be applicable, and a more flexible framework is needed.
In what follows, we base our brief discussion on a theorem taken from Sachkov’s
book [524].


Theorem IX.13 (Generalized quasi-powers). Assume that, for u in a fixed neighbour- 
hood Q of 1, the generating function py (u) of a non-negative discrete random variable 
(supported by Zs) X» admits a representation of the form 


(77) Pn(u) = exp (in(u)) + 0(1)), 


uniformly with respect to u, where each h,(u) is analytic in Q. Assume also the 
conditions, 


(78) h’ (1) +A") > d hin w) > 0 
oe) an ; 
mea (hj (1) + hy (1)? 
uniformly for u € Q. Then, the random variable 
ie Xp ¥ hi) 


"GQ + nD)? 
converges in distribution to a Gaussian with mean 0 and variance 1. 
Proof. See [524, §1.4] for details. Set $\sigma_n^2 = h_n'(1) + h_n''(1)$, and expand the
characteristic function of $X_n$ at $t/\sigma_n$. Thanks to the form (77) and the conditions (78),
inequalities implied by the Mean Value Theorem (Note IV.18, p. 249) give

$$h_n\bigl(e^{it/\sigma_n}\bigr) = h_n(1) + \frac{it}{\sigma_n}\, h_n'(1) - \frac{t^2}{2} + o(1).$$

Thus, the characteristic function of $X_n^\star$ converges to the transform of a standard
Gaussian. The statement follows from the continuity theorem of characteristic functions. ∎
▷ IX.44. Real neighbourhoods. The conditions of Theorem IX.13 can be relaxed by postulating
only that $\Omega$ is a real interval containing u = 1. (Hint: use the continuity theorem for Laplace
transforms of distributions.) ◁

▷ IX.45. Effective speed bounds. When $\Omega$ is a complex neighbourhood of 1 (as stated in
Theorem IX.13), a metric version of the theorem, with speed of convergence estimates, can
be developed assuming effective error bounds in (77) and (78). (Hint: use the Berry–Esseen
inequalities.) ◁

The statement above extends the Quasi-powers Theorem, and, in order to stress
the parallel, we have opted for a complex neighbourhood condition, which has the
benefit of providing better error bounds in applications (Note IX.45). In effect, to see
the analogy, note that if

$$h_n(u) = \beta_n \log B(u) + A(u),$$

then the second quantity in (78) is $O(\beta_n^{-1/2})$ uniformly. The application of this
theorem to saddle-point integrals is in principle routine, although the manipulation of
asymptotic scales associated with expressions involving the saddle-point value may
become cumbersome. The fact that information for positive real values of u is sufficient
(Note IX.44) may, however, help, since in applications, the GF $z \mapsto F(z, u)$
specialized for positive u stands a good chance of being an admissible function in
the sense of Chapter VIII (p. 565), when F(z, 1) is itself admissible. General conditions
have been stated by Bender, Drmota, Gardy, and coauthors [174, 279, 280, 281].
Broadly speaking, such situations constitute the saddle-point perturbation process.
Once more, uniformity of expansions is an issue, which can be technically demanding
(one needs to revisit the dependency of univariate analyses on the secondary parameter
$u \approx 1$), but is not conceptually difficult.
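The order of the second quantity in (78) for a quasi-power can be checked directly. The following sketch assumes, as in the Quasi-powers Theorem, that A and B are analytic near 1, that B(1) = 1, and that the variability $\mathfrak{v}(B) := B''(1) + B'(1) - B'(1)^2$ is non-zero; these hypotheses are recalled here as assumptions and are not restated in the passage above.

```latex
\begin{align*}
h_n'(1) + h_n''(1)
  &= \beta_n\bigl(B'(1) + B''(1) - B'(1)^2\bigr) + A'(1) + A''(1)
   = \beta_n\,\mathfrak{v}(B) + O(1),\\
h_n'''(u)
  &= \beta_n\,\frac{d^3}{du^3}\log B(u) + A'''(u) = O(\beta_n)
   \quad\text{uniformly near } u = 1,\\
\frac{h_n'''(u)}{\bigl(h_n'(1) + h_n''(1)\bigr)^{3/2}}
  &= \frac{O(\beta_n)}{\bigl(\beta_n\,\mathfrak{v}(B)\bigr)^{3/2}\,(1 + o(1))}
   = O\bigl(\beta_n^{-1/2}\bigr).
\end{align*}
```

The first quantity, $h_n'(1) + h_n''(1)$, tends to infinity with $\beta_n$ under the same assumptions, so that both conditions of the generalized theorem are met by an ordinary quasi-power.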


We first detail here the case of singletons in random involutions, for which the
saddle-point is an explicit algebraic function of n and u. Then, we prove the Gaussian
character of the Stirling partition numbers, which is a classic result first obtained by
Harper [322] in 1967. We continue with a pot-pourri of Gaussian laws, which can
be obtained by the saddle-point method, and conclude with a note that provides brief
indications on BGFs only indirectly accessible through functional equations.


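Example IX.31. Singletons in random involutions. The exponential BGF of involutions, with u
marking the number of singleton cycles, is given by

$$\mathcal{F} = \mathrm{SET}\bigl(u\,\mathrm{CYC}_1(\mathcal{Z}) + \mathrm{CYC}_2(\mathcal{Z})\bigr) \quad\Longrightarrow\quad F(z, u) = \exp\left(uz + \frac{z^2}{2}\right).$$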


The saddle-point equation (Theorem VIII.3, p. 553) is then

$$\frac{\partial}{\partial z}\left(uz + \frac{z^2}{2} - (n+1)\log z\right)\bigg|_{z=\zeta} = 0.$$

This defines the saddle-point $\zeta \equiv \zeta(n, u)$,

$$\zeta(n, u) = -\frac{u}{2} + \frac{1}{2}\sqrt{4n + 4 + u^2} = \sqrt{n} - \frac{u}{2} + \frac{u^2 + 4}{8\sqrt{n}} + O(n^{-1}),$$

where the error term is uniform for u near 1. By the saddle-point formula, one has

$$[z^n]F(z, u) \sim \frac{1}{\sqrt{2\pi D(n, u)}}\, F(\zeta(n, u), u)\, \zeta(n, u)^{-n}.$$

The denominator is determined in terms of second derivatives, according to the classical saddle-
point formula (p. 553),

$$D(n, u) = \zeta^2\,\frac{\partial^2}{\partial z^2}\left(uz + \frac{z^2}{2} - (n+1)\log z\right)\bigg|_{z=\zeta(n,u)},$$

and its main asymptotic order does not change when u varies in a sufficiently small neighbourhood
of 1,

$$D(n, u) = 2n - u\sqrt{n} + O(1),$$


again uniformly. Thus, the PGF of the number of singleton cycles satisfies

$$p_n(u) = \frac{F(\zeta(n, u), u)}{F(\zeta(n, 1), 1)} \left(\frac{\zeta(n, u)}{\zeta(n, 1)}\right)^{-n} (1 + o(1)). \tag{79}$$

This is of the form

$$p_n(u) = \exp(h_n(u))\,(1 + o(1)),$$

and local expansions then yield the centring and scaling constants

$$a_n := h_n'(1) = \sqrt{n} - \frac{1}{2} + O(n^{-1/2}), \qquad b_n^2 := h_n'(1) + h_n''(1) = \sqrt{n} - 1 + O(n^{-1/2}).$$

Uniformity in (79) can be checked by returning to the original Cauchy coefficient integral and
to bounds relative to the saddle-point contour. Theorem IX.13 then applies to the effect that
the variable $b_n^{-1}(X_n - a_n)$ is asymptotic to a standard normal. (With a little additional care,
one can verify that the mean $\mu_n$ and the standard deviation $\sigma_n$ are asymptotic to $a_n$ and $b_n$,
respectively.) Therefore:


Proposition IX.19. The number of singletons in a random involution of size n has mean
$\mu_n \sim n^{1/2}$ and standard deviation $\sigma_n \sim n^{1/4}$; it admits a limit Gaussian law.

A random involution thus has, with high probability, a small number of singletons.
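The centring and scaling constants of this example can be compared with exact values for moderate n. The sketch below is an illustration only, not part of the original analysis: it relies on the elementary count n!/(k! m! 2^m) of involutions with k singletons and m = (n − k)/2 two-cycles, and the function names `singleton_law` and `mean_and_variance` are hypothetical.

```python
from fractions import Fraction
from math import factorial, sqrt


def singleton_law(n):
    """Exact law of the number of singletons in a uniform random involution of size n."""
    counts = {}
    for k in range(n % 2, n + 1, 2):          # n - k must be even
        m = (n - k) // 2                      # number of 2-cycles
        counts[k] = factorial(n) // (factorial(k) * factorial(m) * 2**m)
    total = sum(counts.values())              # number of involutions of size n
    return {k: Fraction(c, total) for k, c in counts.items()}


def mean_and_variance(law):
    """Exact mean and variance of a finite law, returned as floats."""
    mean = sum(k * p for k, p in law.items())
    second = sum(k * k * p for k, p in law.items())
    return float(mean), float(second - mean**2)


n = 400
mean, variance = mean_and_variance(singleton_law(n))
# Compare with a_n ~ sqrt(n) - 1/2 and b_n^2 ~ sqrt(n) - 1 from the example.
print(f"mean     = {mean:.4f}   sqrt(n) - 1/2 = {sqrt(n) - 0.5:.4f}")
print(f"variance = {variance:.4f}   sqrt(n) - 1   = {sqrt(n) - 1:.4f}")
```

Exact rational arithmetic is used so that the comparison is not blurred by rounding; the remaining discrepancies should be of the order of $n^{-1/2}$, in line with the error terms stated for $a_n$ and $b_n^2$.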


Example IX.32. The Stirling partition numbers. The numbers ${n \brace k}$ correspond to the BGF

$$\mathcal{F} = \mathrm{SET}\bigl(u\,\mathrm{SET}_{\ge 1}(\mathcal{Z})\bigr) \quad\Longrightarrow\quad F(z, u) = \exp\bigl(u(e^z - 1)\bigr).$$

The saddle-point $\zeta \equiv \zeta(n, u)$ is determined as the positive root near log n of the equation
$\zeta e^{\zeta} = (n+1)/u$. The derivatives occurring in the saddle-point approximation are computed as
derivatives of inverse functions in a standard way. The conditions of Theorem IX.13, together
with the required uniformity, can then be checked. Hence:

Proposition IX.20. The Stirling partition distribution defined by $\frac{1}{S_n}{n \brace k}$, with $S_n$ a Bell
number, is asymptotically normal, with mean and variance that satisfy

$$\mu_n \sim \frac{n}{\log n}, \qquad \sigma_n^2 \sim \frac{n}{(\log n)^2}.$$

(See also p. 594 for first moments.) We refer once more to Sachkov’s book [524, 526] for
computational details.
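A numerical illustration of the saddle-point quantities of this example follows. It is a sketch under stated assumptions: the exact row of Stirling partition numbers is obtained from the classical recurrence, the saddle-point equation $\zeta e^{\zeta} = (n+1)/u$ is solved by Newton iteration, and the centring constant is taken to be $h_n'(1) = e^{\zeta} - 1$, which follows from differentiating $u(e^{\zeta} - 1) - (n+1)\log\zeta$ with respect to u at the saddle-point. The helper names are hypothetical.

```python
from math import exp, log


def stirling2_row(n):
    """Row n of the Stirling partition numbers, as exact integers indexed by k = 0..n."""
    row = [1]                                   # n = 0
    for m in range(1, n + 1):
        new = [0] * (m + 1)
        for k in range(1, m + 1):
            stay = k * row[k] if k < m else 0   # element m joins one of k existing blocks
            new[k] = stay + row[k - 1]          # or element m opens a new block
        row = new
    return row


def saddle_point(n, u=1.0):
    """Positive root of zeta * exp(zeta) = (n + 1) / u, by Newton iteration."""
    target = (n + 1) / u
    zeta = log(target)          # start to the right of the root (assumes target > e)
    for _ in range(60):
        zeta -= (zeta * exp(zeta) - target) / ((zeta + 1) * exp(zeta))
    return zeta


n = 200
row = stirling2_row(n)
bell = sum(row)                                 # Bell number S_n
exact_mean = sum(k * s for k, s in enumerate(row)) / bell
zeta = saddle_point(n)
predicted_mean = exp(zeta) - 1                  # h_n'(1) at the saddle-point
print(exact_mean, predicted_mean, n / log(n))
```

The first-order equivalent n/log n is printed only for comparison: it converges slowly, because $\zeta$ differs from log n by a term of order log log n, whereas the saddle-point value $e^{\zeta} - 1$ tracks the exact mean closely already for moderate n.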


▷ IX.46. Harper’s analysis of Stirling behaviour. Harper’s original derivation [322] of
Proposition IX.20 is of independent interest. Consider the Stirling polynomials defined by
$\sigma_n(u) := n!\,[z^n]\exp(u(e^z - 1))$. Each such polynomial has roots that are real, distinct, and
non-positive. Then, for some positive $\alpha_{n,k}$, one has

$$\sigma_n(u) = u \prod_{k=1}^{n-1} \left(1 + \frac{u}{\alpha_{n,k}}\right).$$

Thus, $\sigma_n(u)/\sigma_n(1)$ can be viewed as the PGF of the sum of a large number of independent (but
not identical) Bernoulli variables. One then can conclude by a suitable version of the Central
Limit Theorem. ◁


Example IX.33. A pot-pourri of saddles and Gaussian laws. Theorem IX.13 combined with
a uniformly controlled use of the saddle-point method yields Gaussian laws for most of the
structures examined in Chapter VIII. We leave the following cases as exercises to the reader.

Section VIII.4 (p. 558) has examined three classes (involutions, set partitions, and
fragmented permutations), of which the first two have already been identified as leading to Gaussian
laws. Fragmented permutations (p. 562) also have a number of components (fragments) that is
Gaussian in the asymptotic limit. In this case, we have a singularity at a finite distance, which
is of the exponential-of-a-pole type. (This last result can be rephrased as the fact that the
coefficients of the classical Laguerre polynomials are asymptotically normal.)

Saddle-point perturbation applies to the field of exponentials-of-polynomials (p. 568),
which vastly generalizes the case of involutions: this field has been pioneered by Canfield [101]
in 1977. The number of components is Gaussian in permutations of order p, permutations with
longest cycle ≤ p, and set partitions with largest block ≤ p, with p a fixed parameter. The
number of connected components in idempotent mappings (p. 571) is also Gaussian.

Integer partitions have been asymptotically enumerated in Section VIII.6 (p. 574). As regards
unconstrained integer partitions, the Gaussian law for the number of summands is originally
due to Erdős and Lehner [194]. By contrast, the number of summands in partitions with distinct
summands is not Gaussian (it is a double-exponential distribution [194]). Subtle phenomena
are at stake in these cases, which involve Pólya operators and functions having the unit circle as
a natural boundary.


▷ IX.47. Saddle-points and functional equations. The average-case analysis of the number of
nodes in random digital trees or “tries” can be carried out using the Mellin transform technology.
The corresponding distributional analysis is appreciably harder and due to Jacquet and
Régnier [344]. A complete description is offered in Section 5.4 of Mahmoud’s book which we
follow. What is required is to analyse the BGF

$$F(z, u) = e^{z}\, T(z, u),$$

where the Poisson generating function T(z, u) satisfies the nonlinear difference equation,

$$T(z, u) = u\, T\left(\frac{z}{2}, u\right)^2 + (1 - u)(1 + z)e^{-z}.$$

This equation is a direct reflection of the problem specification. At u = 1, one has T(z, 1) = 1,
$F(z, 1) = e^z$. The idea is thus to analyse $[z^n]F(z, u)$ by the saddle-point method.

The saddle-point analysis of F requires asymptotic information on T(z, u) for $u = e^{it}$
(the original treatment of [344] is based on characteristic functions). The main idea is to
quasi-linearize the problem, setting

$$L(z, u) = \log T(z, u),$$

with u a parameter. This function satisfies the approximate relation $L(z, u) \approx 2L(z/2, u)$, and a
bootstrapping argument shows that, in suitable regions of the complex plane, $L(z, u) = O(|z|)$,
uniformly with respect to u. The function L(z, u) is then expanded with respect to $u = e^{it}$
at u = 1, i.e., t = 0, using a Taylor expansion, its companion integral representation, and the
bootstrapping bounds. The moment-like quantities,

$$L_j(z) = \frac{\partial^j}{\partial t^j} L(z, e^{it})\bigg|_{t=0},$$

can be subjected to Mellin analysis for j = 1, 2 and bounded for j ≥ 3. In this way, it is shown
that

$$L(z, e^{it}) = L_1(z)\, t + \frac{1}{2} L_2(z)\, t^2 + O(z t^3),$$

uniformly. The Gaussian law under a Poisson model immediately results from the continuity
theorem of characteristic functions. Under the original Bernoulli model, the Gaussian limit
follows from a saddle-point analysis of

$$F(z, e^{it}) = e^{z}\, e^{L(z, e^{it})}.$$

An even more delicate analysis has been carried out by Jacquet and Szpankowski [345]
by means of analytic depoissonization (Subsection VIII.5.3, p. 572). It is relative to path length
in digital search trees and involves the formidable nonlinear bivariate difference-differential
equation


é z \2 
ae =F ( u) : 
See Szpankowski’s book [564] for this and similar results that play an important rôle in the
analysis of data compression algorithms (the Lempel–Ziv schemes). ◁
At this stage, by making use of the material expounded in Sections IX.5–IX.8,
we can avail ourselves of a fairly large arsenal of techniques dedicated to extracting
Gaussian limit laws from BGFs. For instance, we now have the property that all four
Stirling distributions,

$$\frac{1}{n!}{n \brack k}, \qquad \frac{k!}{O_n}{n \brack k}, \qquad \frac{1}{S_n}{n \brace k}, \qquad \frac{k!}{R_n}{n \brace k}, \tag{80}$$

associated with permutations, alignments, set partitions, and surjections are, after
standardization, asymptotically Gaussian. The method is in each case a reflection
of the underlying combinatorics. Typically, for the four cases of (80), we have
used, respectively: (i) singularity analysis perturbation (the exp–log schema for the
SET ∘ CYC construction of permutations); (ii) meromorphic perturbation (for alignments
that are of type SEQ ∘ CYC); (iii) saddle-point perturbation (for set partitions
that are of type SET ∘ SET and whose BGF is entire); (iv) meromorphic perturbation
again (for surjections that are of type SEQ ∘ SET).


IX.9. Local limit laws 


The occurrence of continuous limit laws has been examined so far from the angle
of convergence of (cumulative) distribution functions. Combinatorially, regarding the
random variable $X_n$ that represents some parameter $\chi$ taken over a class $\mathcal{F}_n$, we then
quantify the sums

$$\sum_{j \le k} F_{n,j}.$$

Specifically, we have focused our attention in previous sections on the case in which
these sums (once normalized by $1/F_n$) are approximated by the Gaussian “error function”,
i.e., the (cumulative) distribution function of a standard normal variable.
Combinatorialists would often rather have a direct estimate of the individual counting
quantities, $F_{n,k}$, which is then a true bivariate asymptotic estimate.

Assume that we have already obtained the existence of a convergence in law,
$X_n \Rightarrow Y$, and the standard deviation $\sigma_n$ of $X_n$ tends to infinity while the distribution
of Y admits a density g(x). (Here, typically, g(x) will be the Gaussian density.) If
the $F_{n,k}$ vary smoothly enough, one may expect each of them to share about $1/\sigma_n$ of
the total probability mass, and, in addition, somehow anticipate that their profile could
resemble the curve $x \mapsto g(x)$. In that case, we expect an approximation of the form

$$\frac{F_{n,k}}{F_n} \approx \frac{1}{\sigma_n}\, g(x), \qquad\text{where}\quad x := \frac{k - \mu_n}{\sigma_n},$$

and $\mu_n$ is the expectation of $X_n$. Informally speaking, we say that a Local Limit Law
(LLL) holds in this case.

Figure IX.14. The histogram of the Eulerian distribution scaled to (n + 1) on the
horizontal axis, for n = 3..60. (The distribution is seen to quickly converge to a
bell-shaped curve corresponding to the Gaussian density $e^{-x^2/2}/(2\pi)^{1/2}$.)

We examine here the occurrence of local limit laws of the Gaussian type, which
means convergence of a discrete probability distribution to the Gaussian density function.
Figure IX.14 reveals that, at least for the Eulerian distribution (rises in permutations),
such a local limit law holds, and we know, from De Moivre’s original Central
Limit Theorem (Note IX.1, p. 615) that a similar property holds for binomial coefficients
as well. As a matter of fact, for reasons soon to be presented, virtually all the
Gaussian limit laws obtained in Sections IX.5–IX.8 admit a local version.

Definition IX.4. A sequence of discrete probability distributions, $p_{n,k} = \mathbb{P}\{X_n = k\}$,
with mean $\mu_n$ and standard deviation $\sigma_n$ is said to obey a local limit law of the
Gaussian type if, for a sequence $\epsilon_n \to 0$,

$$\sup_{x \in \mathbb{R}} \left| \sigma_n\, p_{n, \lfloor \mu_n + x\sigma_n \rfloor} - \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} \right| \le \epsilon_n. \tag{81}$$

The local limit law is said to hold with speed $\epsilon_n$.


Note carefully that a local limit law does not logically follow from a convergence
in distribution in the usual sense, upon taking differences (the individual probabilities
appear as differences at nearly identical points of values of a distribution function,
hence they are “hidden” behind the error terms). Some additional regularity assumptions
are needed. Here, we are naturally concerned with distilling local limit laws
from BGFs F(z, u). It turns out, rather nicely, that the Quasi-powers Theorem
(Theorem IX.8, p. 645) can be amended by imposing constraints on the way the secondary
variable affects the asymptotic approximation of $[z^n]F(z, u)$, when u varies globally
on the whole of the unit circle (rather than just in a complex neighbourhood of 1). In
that case, the saddle-point method can be used to carry out the inversion with respect to
the secondary variable u.


Theorem IX.14 (Quasi-powers, Local Limit Law). Let $X_n$ be a sequence of non-negative
discrete random variables with probability generating function $p_n(u)$. Assume
that the $p_n(u)$ satisfy the conditions of the Quasi-powers Theorem, in particular,
the quasi-power approximation,

$$p_n(u) = A(u) \cdot B(u)^{\beta_n} \left(1 + O\left(\frac{1}{\kappa_n}\right)\right),$$

holds uniformly in a fixed complex neighbourhood $\Omega$ of 1. Assume in addition the
existence of a uniform bound,

$$|p_n(u)| \le K^{-\beta_n}, \tag{82}$$

for some K > 1 and all u in the intersection of the unit circle and the complement
$\mathbb{C} \setminus \Omega$. Under these conditions, the distribution of $X_n$ satisfies a local limit law of the
Gaussian type with speed of convergence $O(\beta_n^{-1/2} + \kappa_n^{-1})$.


Proof. Note first that the Quasi-powers Theorem (Theorem IX.8, p. 645) provides the
mean and variance of the distribution of $X_n$ as quantities asymptotically proportional
to $\beta_n$. Furthermore, the standardized version of $X_n$ converges to a standard Gaussian
(in the sense of cumulative distribution functions).

The idea is to use Cauchy’s formula and integrate along the unit circle. We have

$$p_{n,k} \equiv [u^k]\, p_n(u) = \frac{1}{2i\pi} \oint_{|u|=1} p_n(u)\, \frac{du}{u^{k+1}}. \tag{83}$$

We propose to appeal to the saddle-point method as a replacement for the continuity
theorem of integral transforms used in the case of the central limit law (p. 645).

We first estimate $p_{n,k}$ when k is at a fixed number of standard deviations from the
mean $\mu_n$, namely, $k = \mu_n + x\sigma_n$, and accordingly restrict x to some arbitrary compact
set of the real line. We can then import verbatim the treatment of large powers
given in Section VIII.8, p. 585. The integration circle in (83) is split into the “central
range”, near the real axis, where $|\arg(u)| < \theta_0$ with $\theta_0 = n^{-2/5}$, and the remainder of
the contour. The remainder integral is exponentially small, as is verified by the arguments
of the proof of Theorem VIII.8, p. 587 and the condition (82). The perturbative
analysis conducted in Theorem IX.14 then shows the existence of a uniform local
Gaussian approximation (in the sense of (81)), with $\beta_n$ replacing n in the statement of
Theorem IX.14.

We are almost done. It suffices to observe that, as x increases unboundedly, both
the $p_{n,k}$ and the Gaussian density are fast decreasing functions of x, that is, the tails
of the combinatorial distribution and of the limit Gaussian distribution are both small.
(For the $p_{n,k}$, this results from the Large Deviation Theorem, Theorem IX.15 below.)
Thus Equation (81) actually holds when the supremum is taken over all real x (not just
values of x restricted to compact sets). A careful revisitation of the arguments used in
the proof then shows that the speed of convergence is, as in the central limit case, of
the order of $\kappa_n^{-1} + \beta_n^{-1/2}$. ∎


This theorem applies in particular to the case of a movable singularity in a BGF
F(z, u), whenever the dominant singularity $\rho(u)$ of the function $z \mapsto F(z, u)$, as u
ranges over the unit circle |u| = 1, uniquely attains its minimum modulus at u = 1.
Given the positivity inherent in combinatorial GFs, we may expect this situation to occur
frequently. Indeed, for a BGF F(z, u) with non-negative coefficients, we already
know that the property $|\rho(u)| \ge \rho(1)$ holds for $u \ne 1$ and u on the unit circle; only
a strengthening to the strict inequality $|\rho(u)| > \rho(1)$ is needed. Similar comments
apply to the case of variable exponents (where $\Re(\alpha(u))$ should be uniquely minimal)
and, with adaptation, to the generalized quasi-powers framework of Theorem IX.13
(p. 690), which is suitable for the saddle-point method. These are the ultimate reasons
why essentially all our previous central limit results can be supplemented by a local
limit law.


Example IX.34. Local limit laws for sums of discrete random variables. The simplest application
is to the binomial distribution, for which B(u) = (1 + u)/2. In a precise technical
sense, the local limit arises from the BGF, F(z, u) = 1/(1 − z(1 + u)/2), because the dominant
singularity $\rho(u) = 2/(1 + u)$ exists on the whole of the unit circle, |u| = 1, and attains uniquely
its minimum modulus at u = 1, so that $B(u) = \rho(1)/\rho(u)$ is, in modulus, uniquely maximal at u = 1.

More generally, Theorem IX.14 applies to any sum $S_n = T_1 + \cdots + T_n$ of independent and
identically distributed discrete random variables whose maximal span is equal to 1 and whose
PGF is analytic on the unit circle. In that case, the BGF is

$$F(z, u) = \frac{1}{1 - zB(u)},$$

the PGF of $S_n$ is a pure power, $p_n(u) = B(u)^n$, and the fact that the maximal span of the $T_j$
is 1 entails that |B(u)| attains uniquely its maximum at 1 (by the Daffodil Lemma IV.1, p. 266).
Such cases have been known for a long time in probability theory. See Chapter 9 of [294].


Example IX.35. Local limit law for the Eulerian distribution. This example relative to Eulerian
numbers shows the case of a movable singularity, subjected to a meromorphic analysis on
p. 658, which we now revisit. The approximation obtained there is

$$p_n(u) = B(u)^{n+1} + O(2^{-n}),$$

when u is close enough to 1, with

$$B(u) = \rho(u)^{-1} = \frac{u - 1}{\log u}.$$

A rendering of the function |B(u)| when u ranges over the unit circle is given in Figure IX.15.
The analysis leading to (42), p. 658, also characterizes the complete set of poles $\{\rho_j(u)\}_{j \in \mathbb{Z}}$
of the associated BGF F(z, u). From it, we can deduce, by simple complex geometry, that $\rho(u)$
is the unique dominant singularity when $\Re(u) \ge 0$, the other ones remaining at distance at
least $\pi/\sqrt{8} \doteq 1.110721$. Also, it is not hard to see that all the poles, including the dominant
one, remain in the region |z| > 11/10, when $\Re(u) < 0$ and |u| = 1. Thus, $p_n(u)$ satisfies an
estimate which is either of the quasi-powers type (when $\Re(u) \ge 0$) or of the form $O((10/11)^{n})$
(when $\Re(u) < 0$). As a consequence: a local limit law of the Gaussian type holds for the
Eulerian distribution. (This result appears in [35, p. 107].)

Figure IX.15. The values of the function |B(u)| relative to the Eulerian distribution
when |u| = 1, as represented by a polar plot of $|B(e^{i\theta})|$ on the ray of angle $\theta$. (The
dashed contour represents the unit circle, for comparison.) The maximum is uniquely
attained at u = 1, where B(1) = 1, which entails a local limit law.
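The local limit law for the Eulerian distribution can be observed numerically. The sketch below is an illustration under stated assumptions: Eulerian numbers are indexed by the number of descents and generated by the classical recurrence, and the supremum of Definition IX.4 is evaluated only at the lattice points $x = (k - \mu_n)/\sigma_n$ rather than over all real x. The function names are hypothetical.

```python
from math import exp, pi, sqrt


def eulerian_row(n):
    """Eulerian numbers A(n, k), k = 0..n-1: permutations of size n with k descents."""
    row = [1]                                   # n = 1
    for m in range(2, n + 1):
        new = [0] * m
        for k in range(m):
            same = row[k] if k < m - 1 else 0   # insertion keeps the number of descents
            more = row[k - 1] if k >= 1 else 0  # insertion creates one more descent
            new[k] = (k + 1) * same + (m - k) * more
        row = new
    return row


def local_limit_error(n):
    """max over k of |sigma_n p_{n,k} - Gaussian density at (k - mu_n)/sigma_n|."""
    row = eulerian_row(n)
    total = sum(row)                            # equals n!
    probs = [a / total for a in row]
    mean = sum(k * p for k, p in enumerate(probs))
    sigma = sqrt(sum((k - mean) ** 2 * p for k, p in enumerate(probs)))
    return max(
        abs(sigma * p - exp(-(((k - mean) / sigma) ** 2) / 2) / sqrt(2 * pi))
        for k, p in enumerate(probs)
    )


errors = {n: local_limit_error(n) for n in (15, 30, 60)}
for n, err in errors.items():
    print(n, err)
```

The printed discrepancies decrease as n grows, which is the behaviour expected from a local limit law holding with a speed that tends to 0.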


▷ IX.48. Congruence properties associated to runs. Fix an integer d ≥ 2. Let $P_n^{(j)}$ be the
number of permutations of size n whose number of runs is congruent to j modulo d. Then, there
exists a constant K > 1 such that, for all j, one has: $|P_n^{(j)} - n!/d| \le n!\,K^{-n}$. Thus, the number
of runs is in a strong sense almost uniformly distributed over all residue classes modulo d. [Hint: use
properties of the BGF for values of $u = \omega^j$, with $\omega$ a primitive dth root of unity.] ◁
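The near-uniformity over residue classes is easy to observe on exact data. The following sketch is an illustration, not a proof; it assumes that "runs" means ascending runs, so that a permutation with k descents has k + 1 runs, and it measures the largest deviation of the class probabilities from 1/d. The names `eulerian_row` and `residue_deviation` are hypothetical.

```python
from fractions import Fraction


def eulerian_row(n):
    """Eulerian numbers A(n, k), k = 0..n-1: permutations of size n with k descents."""
    row = [1]                                   # n = 1
    for m in range(2, n + 1):
        new = [0] * m
        for k in range(m):
            same = row[k] if k < m - 1 else 0
            more = row[k - 1] if k >= 1 else 0
            new[k] = (k + 1) * same + (m - k) * more
        row = new
    return row


def residue_deviation(n, d):
    """max over j of |P(number of runs = j mod d) - 1/d| for a random permutation of size n."""
    row = eulerian_row(n)
    total = sum(row)                            # equals n!
    classes = [0] * d
    for k, a in enumerate(row):
        classes[(k + 1) % d] += a               # assumption: k descents <-> k + 1 runs
    return float(max(abs(Fraction(c, total) - Fraction(1, d)) for c in classes))


for n in (15, 30, 60):
    print(n, residue_deviation(n, 3))
```

For d = 3 the deviations shrink roughly geometrically with n, consistent with the value $|B(\omega)| = \sqrt{3}/(2\pi/3) < 1$ of the Eulerian quasi-power base at a primitive cube root of unity.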


Example IX.36. A pot-pourri of local limit laws. The following combinatorial distributions
admit a local limit law (LLL).

The number of components in random surjections (p. 653) corresponds to the array of
Stirling$_2$ numbers $k!{n \brace k}$. In that case, we have a movable singularity at $\rho(u) = \log(1 + u^{-1})$,
all the other singularities remaining at distance at least $2\pi$, and escaping to infinity as $u \to -1$.
This ensures the validity of condition (82), hence an LLL (with $\beta_n = n$). Similarly for alignments
(p. 654) associated to the array of Stirling$_1$ numbers $k!{n \brack k}$, for various types of constrained
compositions (p. 654), and more generally, the number of components in supercritical
compositions, including compositions into prime summands.

Variable exponents also lead to an LLL under normal circumstances. Prototypically, the
Stirling cycle distribution (p. 671) associated to the array ${n \brack k}$ satisfies

$$p_n(u) \sim \frac{e^{(u-1)\log n}}{\Gamma(u)},$$

and a suitably uniform version results from the Uniformity Lemma (p. 668), hence an LLL
(this fact was already observed in [35, p. 105]). The property extends to the exp–log schema
including the number of components in mappings (p. 671) and the number of irreducible factors
in polynomials over finite fields (p. 672).

Cases of structures amenable to singularity perturbation with a movable singularity include
leaves in Catalan and other classical varieties of trees (p. 678), patterns in binary trees (p. 680),
as well as the mean level profile of increasing trees (p. 684), whose BGF is given by a differential
equation.

Finally, central limit laws resulting from the saddle-point method and Theorem IX.13
(p. 690) can often be supplemented by an LLL. An important case is that of the number of
blocks in set partitions, which is associated to the Stirling array ${n \brace k}$. (The result appears in
Bender’s paper [35, p. 109], where it is derived from log-concavity considerations.)


▷ IX.49. Non-existence of a local limit. Consider a binomial RV conditioned to assume only
even values, so that $p_{n,2k} = \frac{1}{2^{n-1}}\binom{n}{2k}$ and $p_{n,2k+1} = 0$. The BGF

$$F(z, u) = \frac{1}{1 - z(1+u)/2} + \frac{1}{1 - z(1-u)/2}$$

has two poles, namely $\rho_1(u) = 2/(1 + u)$ and $\rho_2(u) = 2/(1 - u)$, and it is simply not true that
a single one dominates throughout the domain |u| = 1. Accordingly, the PGF satisfies

$$p_n(u) = 2^{-n}\bigl[(1 + u)^n + (1 - u)^n\bigr],$$

and smallness away from the positive real line cannot be guaranteed all along the unit circle
(one has for instance $p_n(1) = p_n(-1)$). ◁
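The failure of the local limit law in this note can be made concrete. The sketch below is an illustration only, with hypothetical function names: it evaluates the discrepancy of Definition IX.4 at lattice points for the plain binomial distribution and for its version conditioned on even values.

```python
from math import comb, exp, pi, sqrt


def local_limit_error(probs):
    """max over k of |sigma p_k - Gaussian density at (k - mean)/sigma|."""
    mean = sum(k * p for k, p in enumerate(probs))
    sigma = sqrt(sum((k - mean) ** 2 * p for k, p in enumerate(probs)))
    return max(
        abs(sigma * p - exp(-(((k - mean) / sigma) ** 2) / 2) / sqrt(2 * pi))
        for k, p in enumerate(probs)
    )


def binomial(n):
    """Binomial(n, 1/2) probabilities."""
    return [comb(n, k) / 2**n for k in range(n + 1)]


def even_binomial(n):
    """Binomial(n, 1/2) conditioned to take even values only."""
    return [comb(n, k) / 2 ** (n - 1) if k % 2 == 0 else 0.0 for k in range(n + 1)]


for n in (50, 200, 800):
    print(n, local_limit_error(binomial(n)), local_limit_error(even_binomial(n)))
```

For the plain binomial the discrepancy tends to 0, whereas for the conditioned variable it stays close to $1/\sqrt{2\pi} \approx 0.4$: odd values carry no mass while the Gaussian density near the mean does not vanish, even though a central limit law (in distribution) still holds.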


IX.10. Large deviations


The term large deviation principle¹⁷ is loosely defined as an exponentially small
bound on the probability that a collection of random variables deviate substantially
from their mean value. It thus quantifies rare events in an appropriate scale. Moment
inequalities, although useful in establishing concentration of distribution
(Subsection III.2.2, p. 161), usually fall short of providing such exponentially small
estimates, and the improvement over Chebyshev inequalities afforded by the methods
presented here can be dramatic. For instance, for runs in permutations (the Eulerian
distribution), the probability of deviating by 10% or more from the mean appears to
be of the order of $10^{-6}$ for n = 1000 and $10^{-65}$ for n = 10000, with a spectacular
$10^{-653}$ for n = 100000. (By contrast, the Chebyshev inequalities would only
bound from above the last probability by a quantity about $10^{-3}$.) Figure IX.16 provides
a plot of the logarithms of the individual probabilities associated to the Eulerian
distribution, which is characteristic of the phenomena at stake here.


Definition IX.5. Let $\beta_n$ be a sequence tending to infinity and $\xi$ a non-zero real number.
A sequence of random variables $(X_n)$ having $\mathbb{E}(X_n) \sim \xi\beta_n$ satisfies a large
deviation property relative to the interval $[x_0, x_1]$ containing $\xi$ if a function W(x)
exists, such that W(x) > 0 for $x \ne \xi$ and, as n tends to infinity:

$$\begin{cases}
\dfrac{1}{\beta_n} \log \mathbb{P}(X_n \le x\beta_n) = -W(x) + o(1), & x_0 < x < \xi \quad\text{(left tails)},\\[1ex]
\dfrac{1}{\beta_n} \log \mathbb{P}(X_n \ge x\beta_n) = -W(x) + o(1), & \xi < x < x_1 \quad\text{(right tails)}.
\end{cases} \tag{84}$$

The function W(x) is called the rate function and $\beta_n$ is the scaling factor.

¹⁷ Large deviation theory is introduced nicely in the book of den Hollander [153].

Figure IX.16. The quantities $\log p_{n,xn}$ relative to the Eulerian distribution illustrate
an extremely fast decay away from the mean, which corresponds to $\xi = 1/2$. Here, the
diagrams are plotted for n = 10, 20, 30, 40 (top to bottom). The common shape of
the curves indicates a large deviation principle.


Figuratively, a large deviation property, in the case of left tails ($x < \xi$), expresses
an exponential approximation of the rough form

$$\mathbb{P}(X_n \le x\beta_n) \approx e^{-\beta_n W(x)}$$

for the probability of being away from the mean, and similarly for right tails. Under
the conditions of the Quasi-powers Theorem, a large deviation principle invariably
holds, a fact first observed by Hwang in [338].


Theorem IX.15 (Quasi-powers, large deviations). Consider a sequence of discrete
random variables $(X_n)$ with PGF $p_n(u)$. Assume the conditions of the Quasi-powers
Theorem (Theorem IX.8, p. 645); in particular, there exist functions A(u), B(u), which
are analytic over some interval $[u_0, u_1]$ with $0 < u_0 < 1 < u_1$, such that, with $\kappa_n \to \infty$,
one has

$$p_n(u) = A(u)\, B(u)^{\beta_n} \bigl(1 + O(\kappa_n^{-1})\bigr), \tag{85}$$

uniformly. Then the $X_n$ satisfy a large deviation property, relative to the interval
$[x_0, x_1]$, where $x_0 = u_0 B'(u_0)/B(u_0)$, $x_1 = u_1 B'(u_1)/B(u_1)$; the scaling factor is $\beta_n$
and the large deviation rate W(x) is given by

$$W(x) = -\min_{u \in [u_0, u_1]} \log\left(\frac{B(u)}{u^x}\right). \tag{86}$$

Proof. We examine the case of the left tails, $\mathbb{P}(X_n \le x\beta_n)$ with $x < \xi$ and $\xi = B'(1)$,
the case of right tails being similar. It proves instructive to start with a simple
inequality that suggests the physics of the problem, then refine it into an equality by a
classical technique known as “shifting of the mean”.

Inequalities. The basic observation is that, if $f(u) = \sum_k f_k u^k$ is a function
analytic in the unit disc with non-negative coefficients at 0, then, for positive u ≤ 1,
we have

$$\sum_{j \le k} f_j \le \frac{f(u)}{u^k}, \tag{87}$$

which belongs to the broad category of saddle-point bounds (see also our discussion
of tail bounds on p. 643). The combination of (87), applied to $p_n(u) := \mathbb{E}(u^{X_n})$, and
of assumption (85) yields

$$\mathbb{P}(X_n \le x\beta_n) \le O(1) \left(\frac{B(u)}{u^x}\right)^{\beta_n}, \tag{88}$$

which is usable a priori for any fixed $u \in [u_0, 1]$. In particular the value of u that
minimizes $B(u)/u^x$ can be used, provided that this value of u exists, is less than 1,
and also the minimum itself is less than 1.
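The saddle-point bound (87) and the choice of the minimizing u can be illustrated on the simplest case. The sketch below assumes the binomial PGF $f(u) = ((1 + u)/2)^n$, for which the equation $uB'(u)/B(u) = x$ with $B(u) = (1 + u)/2$ and $x = k/n$ has the explicit root $u = k/(n - k)$; the function names are hypothetical.

```python
from math import comb


def left_tail(n, k):
    """P(X <= k) for X ~ Binomial(n, 1/2), from exact integer sums."""
    return sum(comb(n, j) for j in range(k + 1)) / 2**n


def saddle_point_bound(n, k, u):
    """Right-hand side f(u) / u^k of the bound, for f(u) = ((1 + u)/2)^n and 0 < u <= 1."""
    return ((1 + u) / 2) ** n / u**k


n, k = 100, 30
u_star = k / (n - k)        # root of u B'(u)/B(u) = k/n when B(u) = (1 + u)/2
print("exact tail        :", left_tail(n, k))
print("bound at u = 1    :", saddle_point_bound(n, k, 1.0))
print("bound at u = 0.6  :", saddle_point_bound(n, k, 0.6))
print("bound at optimum  :", saddle_point_bound(n, k, u_star))
```

The bound is trivial at u = 1 and improves as u decreases towards the root of $uB'(u)/B(u) = x$; it then captures the correct exponential order of the tail, the remaining gap being a subexponential factor.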

The required conditions are granted by developments closely related to Boltzmann
models and associated convexity properties, as developed in Note IV.46, p. 280,
which we revisit here. Simple algebra with derivatives shows that

$$\frac{d}{du}\left(\frac{B(u)}{u^x}\right) = \left(\frac{uB'(u)}{B(u)} - x\right)\frac{B(u)}{u^{x+1}}, \qquad
\frac{d}{du}\left(\frac{uB'(u)}{B(u)}\right) = \frac{1}{u}\,\mathfrak{v}_t(B(ut)), \tag{89}$$

where by $\mathfrak{v}_t(B(ut))$ is meant the analytic variance of the function $t \mapsto B(ut)$: u
is treated as a parameter and $\mathfrak{v}(f)$ is taken in the sense of (27), p. 645. From the
non-negativity of variances, we see by the second relation of (89) that the function
$uB'(u)/B(u)$ is increasing. This grants us the existence of a root of the equation
$uB'(u)/B(u) = x$, at which point, by the first relation of (89), the quantity $B(u)/u^x$
attains its minimum. Since B(1) = 1, that minimum is itself strictly less than 1, so
that an inequality,

$$\log \mathbb{P}(X_n \le x\beta_n) \le -\beta_n W(x) + O(1), \tag{90}$$

results, with W(x) as stated in (86).
Equalities. The family $X_{n,\lambda}$ of random variables, with PGF

$$p_{n,\lambda}(u) := \frac{p_n(\lambda u)}{p_n(\lambda)},$$

when $\lambda$ varies, is known as an exponential family (or as a family of exponentially
shifted versions of $X_n$). Fix now $\lambda$ to be the particular value of u at which the minimum
of $B(u)/u^x$ is attained, so that $\lambda B'(\lambda)/B(\lambda) = x$. The PGFs $p_{n,\lambda}(u)$ satisfy a
quasi-power approximation

$$p_{n,\lambda}(u) = \frac{A(\lambda u)}{A(\lambda)} \left(\frac{B(\lambda u)}{B(\lambda)}\right)^{\beta_n} \bigl(1 + O(\kappa_n^{-1})\bigr), \tag{91}$$

so that a central limit law (of Gaussian type) holds for these specific $X_{n,\lambda}$. By elementary
calculus, we have $\mathbb{E}(X_{n,\lambda}) = x\beta_n + O(1)$. Thus, by the Quasi-powers Theorem
applied to the centre of the Gaussian distribution, we find

$$\lim_{n \to \infty} \mathbb{P}(X_{n,\lambda} \le x\beta_n) = \frac{1}{2}. \tag{92}$$

Fix now an arbitrary $\epsilon > 0$. We have a useful refinement of (92):

$$\mathbb{P}\bigl((x - \epsilon)\beta_n < X_{n,\lambda} \le x\beta_n\bigr) = \frac{1}{2} + o(1). \tag{93}$$

We can then write

$$\begin{aligned}
\mathbb{P}(X_n \le x\beta_n) &\ge \mathbb{P}\bigl((x - \epsilon)\beta_n < X_n \le x\beta_n\bigr)\\
&\ge \frac{p_n(\lambda)}{\lambda^{(x-\epsilon)\beta_n}}\, \mathbb{P}\bigl((x - \epsilon)\beta_n < X_{n,\lambda} \le x\beta_n\bigr)\\
&= \frac{A(\lambda)\, B(\lambda)^{\beta_n}}{\lambda^{(x-\epsilon)\beta_n}} \left(\frac{1}{2} + o(1)\right),
\end{aligned} \tag{94}$$

where the second line results from the definition of exponential families and the third
from (93) and the quasi-powers assumption. Then, since the last line of (94) is valid
for any $\epsilon > 0$, we get, in the limit $\epsilon \to 0$, the desired lower bound:

$$\log \mathbb{P}(X_n \le x\beta_n) \ge -\beta_n W(x) + o(\beta_n). \tag{95}$$

Hence, Equation (95) combined with its converse (90), yields the statement relative to
left tails. ∎


The proof above yields an explicit algorithm to compute the rate function W(x)
from B(u) and its derivatives. Indeed, the quantity $\lambda \equiv \lambda(x)$ is obtained by inversion
of $uB'(u)/B(u)$,

$$\lambda(x)\, \frac{B'(\lambda(x))}{B(\lambda(x))} = x, \tag{96}$$

and the large deviation rate function is

$$W(x) = -\log B(\lambda(x)) + x \log \lambda(x). \tag{97}$$

▷ IX.50. Extensions. Speed of convergence estimates can be developed by making use of the
Quasi-powers Theorem, with error terms. Also “local” forms of the large deviation principle
(concerning $\log p_{n,k}$) can be derived under additional properties similar to those of
Theorem IX.14 (p. 696) relative to local limit laws. (Hint: see [338, 339].) ◁


Example IX.37. Large deviations for the Eulerian distribution. In this case, the BGF has
a unique dominant singularity for u with ε < u < 1/ε, for any ε > 0. Thus, there is a
quasi-power expansion with

        B(u) = \frac{u - 1}{\log u},

valid on any compact subinterval of the positive real line. Then, λ(x) is computable as the
inverse function of

        h(u) = \frac{u}{u - 1} - \frac{1}{\log u}.

(The function h(u) maps increasingly R_{>0} to the interval (0, 1), so that its inverse function is
always defined.) The function W(x) is then computable by (96) and (97). Figure IX.17 presents
a plot of W(x) that explains the data of Figure IX.16, p. 700, as well as the estimates given in
the introduction of this section.

Figure IX.17. The large deviation rate function −W(x) relative to the Eulerian distribution,
for x ∈ [0.3, 0.7], with scaling sequence β_n = n and ξ = 1/2.
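The inversion procedure of (96) and (97) can be sketched numerically for the Eulerian case. The fragment below is only an illustration under stated assumptions: it works in the variable s = log u (a choice made here for numerical stability, not taken from the text), locates λ(x) by bisection on h, and takes the rate function as −log B(λ) + x log λ, the sign convention under which W is non-negative and vanishes at x = 1/2. The names `h`, `log_B` and `rate` are hypothetical.

```python
import math

def h(s):
    """h(u) = u/(u-1) - 1/log(u), written in terms of s = log u."""
    if abs(s) < 1e-5:
        return 0.5 + s / 12.0          # first terms of the Taylor expansion at s = 0
    return 1.0 / (-math.expm1(-s)) - 1.0 / s

def log_B(s):
    """log B(u) for B(u) = (u-1)/log(u), again in terms of s = log u."""
    if abs(s) < 1e-5:
        return s / 2.0 + s * s / 24.0  # first terms of the Taylor expansion at s = 0
    return math.log(math.expm1(s) / s)

def rate(x, lo=-200.0, hi=200.0, iterations=200):
    """Assumed rate function W(x) = -log B(lambda) + x log(lambda), with h(lambda) = x.

    Bisection is legitimate because h is increasing; x should stay well inside (0, 1).
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if h(mid) < x:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)                # s = log(lambda(x))
    return -log_B(s) + x * s
```

Since B(1/u) = B(u)/u, this function should be symmetric about x = 1/2, which gives a convenient sanity check on any implementation.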


All the distributions mentioned in previous pot-pourris (Example IX.27, p. 683 
and IX.36, p. 698) that result either from meromorphic perturbation or from sin- 
gularity perturbation satisfy a large deviation principle, as a consequence of Theo- 
rem IX.15. For distributions amenable to the saddle-point method (Example IX.33, 
p. 693) tail probabilities also tend to be very small: their approximations are not ex- 
pressed as simply as in Definition IX.5, but depend on the particulars of the asymp- 
totic scale at play in each case. The interest of large deviation estimates in probability 
theory stems from their robustness with respect to changes in randomness models or 
under composition with non-mass-preserving transformations. In combinatorics, they 
have been most notably used to analyse depth and height in several types of increasing 
trees and search trees by Devroye and his coauthors [95, 160, 161]. 


IX.11. Non-Gaussian continuous limits


Previous sections of this chapter have stressed two basic paradigms for bivariate 
asymptotics: 
— a “minor” change in singularities, leading to discrete laws, which occurs 
when the nature and location of the dominant singularity remains unaffected 
by small changes in the values of the secondary parameter u; 
— a “major” singularity perturbation mode leading to the Gaussian law, which
arises from a variable exponent and/or a movable singularity. 


However, it has been systematically the case, so far, that the collection of singular 
expansions parameterized by the auxiliary variable all belong to a sufficiently gentle 
analytic type (eventually leading to a quasi-power approximation) and, in particu- 
lar, exhibit no sharp discontinuity when the secondary parameter traverses the special 
value u = 1. In this section we first illustrate, by means of examples, the way
discontinuities in singular behaviour induce non-Gaussian laws (Subsection IX.11.1),
then examine a fairly general case of confluence of singularities, corresponding to the
critical composition schema (Subsection IX.11.2). The discontinuities observed in
such situations are reminiscent of what is known as phase-transition phenomena in
statistical physics, and we have found it suggestive to import this terminology here.


IX.11.1. Phase-transition diagrams. Perhaps the simplest case of discontinuity
in singular behaviour is provided by the BGF,

        F(z, u) = \frac{1}{(1 - z)(1 - zu)},

where u records the parameter equal to the number of initial occurrences of a in a
random word of F = SEQ(a) SEQ(b). Clearly the distribution is uniform over the
discrete set of values {0, 1, ..., n}. The limit law is then continuous: it is the uniform
distribution over the real interval [0, 1]. From the point of view of the singular structure
of z ↦ F(z, u), summarized by a formula of the type (1 − z/ρ(u))^{−α(u)}, three
distinct cases arise, depending on the values of u:

— u < 1: simple pole at ρ(u) = 1, corresponding to α(u) = 1;
— u = 1: double pole at ρ(1) = 1, corresponding to α(u) = 2;
— u > 1: simple pole at ρ(u) = 1/u, corresponding to α(u) = 1.

Here, both the location of the singularity ρ(u) and the singular exponent α(u) experience
a non-analytic transition at u = 1. This situation arises from a collapsing of two
singular terms, when u = 1.

In order to visualize such cases, it is useful to introduce a simplified diagram
representation, called a phase-transition diagram and defined as follows. Write Z =
ρ(u) − z and summarize the singular expansion by its dominant singular term Z^{−α(u)}.
Then, the diagram corresponding to F(z, u) is

        u = 1 − ε        u = 1          u = 1 + ε
        ρ(u) = 1         ρ(1) = 1       ρ(u) = 1/u          Z := ρ(u) − z.
        Z^{−1}           Z^{−2}         Z^{−1}

A complete classification of such discontinuities is lacking (see, however, Mari- 
anne Durand’s thesis [181] for several interesting schemas), and is probably beyond 
reach given the vast diversity of situations to be encountered in a combinatorialist’s 
practice. We provide here two illustrations: the first example is relative to the classical 
theory of coin-tossing games (the arcsine distribution); the second one is relative to 
area under excursions and path length in trees (the Airy distribution of the area type). 
Both are revisited here under the perspective of phase transition diagrams, which pro- 
vide a useful way to approach and categorize non-Gaussian limits. 


Example IX.38. Arcsine law for unbiased random walks. This problem is studied in detail
by Feller [205, p. 94] who notes, regarding gains in coin-tossing games: “Contrary to intuition,
the maximum accumulated gain is much more likely to occur towards the very beginning or the
very end of a coin-tossing game than somewhere in the middle.” See Figure IX.18.

Figure IX.18. Histograms of the distribution of the location of the maximum of a
random walk for n = 10..60 (left) and the density of the arcsine law (right).

We let χ be the time of first occurrence of the maximum in a random game (that is, a
walk with ±1 steps) and write X_n for the RV representing χ restricted to the set 𝒲_n of walks
of duration n. The BGF W(z, u), where u marks χ, results from the standard decomposition
of positive walks. Essentially, there is a sequence of steps ascending to the (non-negative)
maximum accompanied by “arches” (the left factor) followed by a mirror excursion back to
the maximum, followed by a sequence of descending steps with their companion arches. This
construction translates directly into an equation satisfied by the BGF W(z, u) of the location of
the first maximum

(98)    W(z, u) = \frac{1}{1 - zuD(zu)} \cdot D(z) \cdot \frac{1}{1 - zD(z)},

which involves the GF of gambler's ruin sequences (equivalently Dyck excursions,
Example IX.8, p. 635), namely,

(99)    D(z) = \frac{1 - \sqrt{1 - 4z^2}}{2z^2}.

In such a simple case, explicit expressions are available from (98), when we expand first with
respect to u, then to z. We obtain in this way the ultra-classical result that, for n = 2ν, the
probability that X_n equals either k = 2r or k = 2r + 1 is \frac{1}{2} u_{2r} u_{2ν−2r}, where
u_{2ν} := 2^{−2ν} \binom{2ν}{ν}. The usual approximation of central binomial coefficients,
u_{2ν} ∼ (πν)^{−1/2}, followed by a summation then leads to the following statement.

Proposition IX.21 (Arcsine law). For any x ∈ (0, 1), the position X_n of the first maximum in
a random walk of even length n satisfies a limit arcsine law:

        \lim_{n \to \infty} P_n(X_n < xn) = \frac{2}{\pi} \arcsin \sqrt{x}.
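The exact formula behind the arcsine law can be checked by brute force on short walks. The sketch below enumerates all ±1 walks of an even length n, records the time at which the maximum is first attained (time 0 counts, the maximum being non-negative), and provides the value ½ u_{2r} u_{n−2r} for comparison. It reads the statement as giving this probability to each of k = 2r and k = 2r + 1 separately, for 0 < k < n; that reading, like the function names, is an assumption of the illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def first_max_law(n):
    """Exact law of the time of first maximum over all 2^n walks with +1/-1 steps."""
    counts = [0] * (n + 1)
    for steps in product((1, -1), repeat=n):
        height, best, when = 0, 0, 0
        for t, step in enumerate(steps, start=1):
            height += step
            if height > best:          # strict inequality: first time the maximum is reached
                best, when = height, t
        counts[when] += 1
    return [Fraction(c, 2 ** n) for c in counts]

def u(m):
    """u_m = 2^{-m} binom(m, m/2), for even m."""
    return Fraction(comb(m, m // 2), 2 ** m)

def feller_value(n, k):
    """(1/2) u_{2r} u_{n-2r} with r = k // 2, intended for 0 < k < n and n even."""
    r = k // 2
    return Fraction(1, 2) * u(2 * r) * u(n - 2 * r)
```

Calling `first_max_law(10)` and comparing entry k with `feller_value(10, k)` for k = 1, ..., 9 exercises the formula on 1024 walks; exact rationals are used so that the comparison involves no rounding.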


It is instructive to compare this to the way singularities evolve as u crosses the value 1.
The dominant positive singularity is at ρ(u) = 1/2 if u < 1, while ρ(u) = 1/(2u) if u > 1.
Local expansions show that, with c_<(u), c_>(u) two computable functions, there holds:

        W(z, u) \sim c_<(u) \frac{1}{\sqrt{1 - 2z}}  (u < 1),        W(z, u) \sim c_>(u) \frac{1}{\sqrt{1 - 2zu}}  (u > 1).

Naturally, at u = 1, all sequences are counted and W(z, 1) = 1/(1 − 2z). Thus, the corresponding
phase-transition diagram is (see Figure IX.19):

        u = 1 − ε        u = 1          u = 1 + ε
        ρ(u) = 1/2       ρ(1) = 1/2     ρ(u) = 1/(2u)
        Z^{−1/2}         Z^{−1}         Z^{−1/2}

Figure IX.19. A plot of 1/W(z, u) for z ∈ [0.4, 0.55] when u is assigned values
between 1/2 and 5/4 (left); the exponent function α(u) (top right) and the singular
value ρ(u) (bottom right), for u ∈ [0.5, 0.55].

The point to be made here is that the arcsine law could be expected when a similar
phase-transition diagram occurs. There is indeed universality in this singular view of the arcsine law,
which extends to walks with zero drift (Chapter VII). This analytic kind of universality is a
parallel to the universality of Brownian motion, which is otherwise familiar to probabilists.


▷ IX.51. Number of maxima and other stories. The construction underlying (98) also serves
to analyse: (i) the number of times the maximum is attained; (ii) the difference between the
maximum and the final altitude of the walk; (iii) the duration of the period following the last
occurrence of the maximum. ◁


Example IX.39. Path length in trees. A final example is the distribution of path length in trees,
whose non-Gaussian limit law was originally characterized by Louchard and Takács [416,
417, 567, 569]. The distribution is recognized not to be asymptotically Gaussian, as is verified
by a computation of the first few moments. In the case of general Catalan trees, the analysis
is equivalent to that of area under Dyck paths (Examples V.9, p. 330, and VII.26, p. 533) and is
closely related to our discussion of coin fountains and parallelogram polyomino models, earlier
in this chapter (p. 662). It reduces to that of the functional equation

        F(z, u) = \frac{1}{1 - zF(zu, u)},

which determines F(z, u) as a formal continued fraction, and setting F(z, u) =
A(z, u)/B(z, u), we found (p. 331)

        B(z, u) = 1 + \sum_{n \ge 1} (-1)^n \frac{u^{n(n-1)} z^n}{(1 - u)(1 - u^2) \cdots (1 - u^n)},

with a very similar expression for A(z, u). Because of the quadratic exponent involved in the
powers of u, the function z ↦ F(z, u) has radius of convergence 0 when u > 1, and is thus
non-analytic. By contrast, when u < 1, the function z ↦ B(z, u) is an entire function, so that
z ↦ F(z, u) is meromorphic. Hence the singularity diagram:

        u = 1 − ε        u = 1          u = 1 + ε
        ρ(u) > 1/4       ρ(1) = 1/4     ρ(u) = 0
        Z^{−1}           Z^{1/2}        -

The limit law is the Airy distribution of the area type [244, 352, 416, 417, 567, 569], which
we have encountered in Chapter VII, p. 533. By an analytical tour de force, Prellberg [496]
has developed a method based on contour integral representations and coalescing saddle-points
(Chapter VIII, p. 603) that permits us to make precise the phase transition diagram above and
obtain uniform asymptotic expansions in terms of the Airy function. Since similar problems
occur in relation to connectivity of random graphs under the Erdős–Rényi model [254], and
conjecturally in self-avoiding walks (p. 363), future years might see more applications of
Prellberg's methods.


IX.11.2. Semi-large powers, critical compositions, and stable laws. We conclude
this section by a discussion of critical compositions that typically involve confluences
of singularities and lead to a general class of continuous distributions closely
related to stable laws of probability theory. We start with an example where everything
is explicit, that of zero contacts in random bridges, then state a general theorem
on “semi-large” powers of functions of singularity analysis type, and finally return to
combinatorial applications, specifically trees and maps.


Example IX.40. Zero-contacts in bridges. Consider once more fluctuations in coin tossings,
and specifically bridges, corresponding to a conditioning of the game by the fact that the final
gain is 0 (negative capitals are allowed). These are sequences of arbitrary positive or negative
“arches”, and the number of arches in a bridge is exactly equal to the number of intermediate
steps at which the capital is 0. From the arch decomposition, it is found that the ordinary BGF
of bridges with z marking length and u marking zero-contacts is

        B(z, u) = \frac{1}{1 - 2uz^2 D(z)},

with D(z) as in (99), p. 705. Analysing this function is conveniently done by introducing

        F(z, u) \equiv B\left( \frac{\sqrt{z}}{2}, u \right) = \frac{1}{1 - u(1 - \sqrt{1 - z})}.

The phase-transition diagram is then easily found to be:

        u = 1 − ε        u = 1          u = 1 + ε
        ρ(u) = 1         ρ(1) = 1       ρ(u) = 1 − (1 − u^{−1})^2
        Z^{1/2}          Z^{−1/2}       Z^{−1}

Thus, there are discontinuities, both in the location of the singularity and the exponent, but of a
different type from that which gives rise to the arcsine law of random walks.

The problem of the limit law is here easily solved since explicit expressions are provided
by the Lagrange Inversion Theorem. One finds:

(100)   [u^k z^n] F(z, u) = [z^n] (1 - \sqrt{1 - z})^k = \frac{k}{n} 2^{k - 2n} \binom{2n - k - 1}{n - 1}.

A random variable with density and distribution function given by

(101)   r(x) = \frac{x}{2} e^{-x^2/4},    R(x) = 1 - e^{-x^2/4}

is called a Rayleigh law. Then Stirling's formula easily provides the following proposition.

Proposition IX.22. The number X_n of zero-contacts of a random bridge of size 2n satisfies, as
n → ∞, a local limit law of the Rayleigh type:

        \lim_{n \to \infty} \sqrt{n} \, P(X_n = x\sqrt{n}) = \frac{x}{2} e^{-x^2/4}.

The explicit character of (100) makes the analysis transparently simple.
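Both the Lagrange form (100) and the Rayleigh approximation lend themselves to a direct check. The following sketch expands (1 − √(1 − z))^k as an exact power series, compares its coefficients with the closed form (k/n) 2^{k−2n} binom(2n−k−1, n−1), and then compares √n · P(X_n = k) with the density (x/2) e^{−x²/4} at x = k/√n. All function names are hypothetical, and the normalisation [z^n](1 − z)^{−1/2} = 4^{−n} binom(2n, n) is the value of F(z, 1) assumed here.

```python
from fractions import Fraction
from math import comb, exp, sqrt

def sqrt_one_minus_z(N):
    """Coefficients of sqrt(1 - z) up to z^N, as exact fractions."""
    coeffs = [Fraction(1)]
    for n in range(1, N + 1):
        coeffs.append(coeffs[-1] * Fraction(2 * n - 3, 2 * n))  # binomial series ratio
    return coeffs

def mul(a, b, N):
    """Product of two power series, truncated after z^N."""
    out = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(N + 1 - i):
                out[i + j] += ai * b[j]
    return out

def power_coeffs(k, N):
    """Coefficients of (1 - sqrt(1 - z))^k up to z^N."""
    h = [-c for c in sqrt_one_minus_z(N)]
    h[0] += 1                                   # h(z) = 1 - sqrt(1 - z)
    out = [Fraction(1)] + [Fraction(0)] * N
    for _ in range(k):
        out = mul(out, h, N)
    return out

def lagrange(n, k):
    """Closed form of (100), intended for 1 <= k <= n."""
    return Fraction(k, n) * comb(2 * n - k - 1, n - 1) / 2 ** (2 * n - k)

def scaled_probability(n, k):
    """sqrt(n) * P(X_n = k), with P(X_n = k) = [z^n u^k] F / [z^n] F(z, 1)."""
    total = Fraction(comb(2 * n, n), 4 ** n)    # [z^n] (1 - z)^(-1/2)
    return sqrt(n) * float(lagrange(n, k) / total)

def rayleigh_density(x):
    """Density r(x) of (101)."""
    return 0.5 * x * exp(-x * x / 4)
```

For instance, `scaled_probability(2000, 45)` can be compared with `rayleigh_density(45 / sqrt(2000))`; the agreement is only approximate at finite n, with a discrepancy of the order of 1/√n.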


▷ IX.52. The number of cyclic points in mappings. The number of cyclic points in mappings
has exponential BGF (1 − uT(z))^{−1}, with T the Cayley tree function. The singularity diagram
is of the same form as in Example IX.40. Explicit forms are derived from Lagrange inversion:
the limit law is again Rayleigh. This property extends to the number of cyclic points in a
simple variety of mappings (e.g., mappings defined by a finite constraint on degrees, as in
Example VII.10, p. 464): see [18, 175, 176]. ◁

Both Example IX.40 and Note IX.52 above exemplify the situation of an analytic
composition scheme of the form (1 − uh(z))^{−1} which is critical, since in each case h
assumes value 1 at its singularity. Both can be treated elementarily since they involve
powers that are amenable to Lagrange inversion, eventually resulting in a Rayleigh 
law. As we now explain, there is a family of functions that appear to play a universal 
role in problems sharing similar singular types. What follows is largely borrowed 
from an article by Banderier et al. [28]. 

We first introduce a function S that otherwise naturally surfaces in the study of
stable¹⁸ distributions in probability theory. For any parameter λ ∈ (0, 2), define the
entire function

(102)   S(x, \lambda) := \begin{cases}
            \dfrac{1}{\pi} \sum_{k \ge 1} (-1)^{k-1} x^k \dfrac{\Gamma(1 + \lambda k)}{\Gamma(1 + k)} \sin(\pi k \lambda)        & (0 < \lambda < 1) \\
            \dfrac{1}{\pi x} \sum_{k \ge 1} (-1)^{k-1} x^k \dfrac{\Gamma(1 + k/\lambda)}{\Gamma(1 + k)} \sin(\pi k/\lambda)   & (1 < \lambda < 2)
        \end{cases}

Figure IX.20. The S-functions for λ = 0.1..0.8 (left; from bottom to top) and for
λ = 1.2..1.9 (right; from top to bottom); the thicker curves represent the Rayleigh
law (left, λ = 1/2) and the Airy map law (right, λ = 3/2).

18. In probability theory, stable laws are defined as the possible limit laws of sums of independent
identically distributed random variables. The function S is a trivial variant of the density of the stable law
of index λ; see Feller's book [206, p. 581-583]. Valuable information regarding stable laws may be found
in the books by Breiman [93, Sec. 9.8], Durrett [182, Sec. 2.7], and Zolotarev [629].

The function S(x; 1/2) is a variant of the Rayleigh density (101). The function
S(x; 3/2) constitutes the density of the “Airy map distribution” found in random maps
as well as in other coalescence phenomena, as discussed below; see (109).
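The series defining S is straightforward to evaluate for moderate x. The sketch below sums the two branches of (102), going through logarithms of the Gamma function to avoid overflow; the truncation at 120 terms and the name `S` with its `terms` parameter are choices of this illustration. For λ = 1/2 only odd k contribute and the series sums to x e^{−x²/4}/(2√π), that is, the Rayleigh density (101) divided by √π; this closed form was obtained here by summing the series and is not quoted from the text.

```python
import math

def S(x, lam, terms=120):
    """Truncated series (102) for S(x, lambda), with x > 0 and lambda in (0,1) or (1,2)."""
    if 0 < lam < 1:
        a, prefactor = lam, 1.0 / math.pi
    elif 1 < lam < 2:
        a, prefactor = 1.0 / lam, 1.0 / (math.pi * x)
    else:
        raise ValueError("lambda must lie in (0, 1) or (1, 2)")
    total = 0.0
    for k in range(1, terms + 1):
        # log of x^k * Gamma(1 + a k) / Gamma(1 + k)
        log_mag = k * math.log(x) + math.lgamma(1 + a * k) - math.lgamma(1 + k)
        total += (-1) ** (k - 1) * math.exp(log_mag) * math.sin(math.pi * k * a)
    return prefactor * total
```

The alternating terms grow before they decay, so plain floating-point summation of this kind loses accuracy for large x; it is meant for values of x of order 1.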
Theorem IX.16 (Semi-large powers). The coefficient of z^n in a power H(z)^k of a
Δ-continuable function H(z) with singular exponent λ admits the following asymptotic
estimates.

(i) For 0 < λ < 1, that is, H(z) = σ − h_λ(1 − z/ρ)^λ + O(1 − z/ρ), and when
k = xn^λ, with x in any compact subinterval of (0, +∞), there holds

(103)   [z^n] H^k(z) \sim \sigma^k \rho^{-n} \frac{1}{n} S\left( \frac{x h_\lambda}{\sigma}, \lambda \right).

(ii) For 1 < λ < 2, that is, H(z) = σ − h_1(1 − z/ρ) + h_λ(1 − z/ρ)^λ + O((1 − z/ρ)^2),
when k = (σ/h_1)n + xn^{1/λ}, with x in any compact subinterval of (−∞, +∞),
there holds

(104)   [z^n] H^k(z) \sim \sigma^k \rho^{-n} \frac{1}{n^{1/\lambda}} (h_1/h_\lambda)^{1/\lambda} S\left( \frac{x h_1^{1 + 1/\lambda}}{\sigma h_\lambda^{1/\lambda}}, \lambda \right).

(iii) For λ > 2, a Gaussian approximation holds. In particular, for 2 < λ < 3,
that is, H(z) = σ − h_1(1 − z/ρ) + h_2(1 − z/ρ)^2 − h_λ(1 − z/ρ)^λ + O((1 − z/ρ)^3),
when k = (σ/h_1)n + x√n, with x in any compact subinterval of (−∞, +∞), there holds
k Se ofhi gr pe 


(105) [z"]H*(z) ~ ofp Thala 


with a = 2(92 — $L)0?/hi. 
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The term “semi-large” refers to the fact that the exponents k in case (i) are of the 
form O(n^θ) for some θ < 1 chosen in accordance with the region where an “interesting”
renormalization takes place and dependent on each particular singular exponent.
When the interesting region reaches the O(n) range in case (iii), the analysis of large 
powers, as detailed in Chapter VIII (p. 591), takes over and Gaussian forms result. 


Proof. The proofs are somewhat similar to the basic ones in singularity analysis, but
they require a suitable adjustment of the geometry of the Hankel contour and of the
corresponding dimensioning.

Case (i). A classical Hankel contour, with the change of variable z = ρ(1 − t/n),
yields the approximation

        [z^n] H^k(z) \sim -\frac{\sigma^k \rho^{-n}}{2i\pi n} \int e^{t} e^{-\frac{h_\lambda x}{\sigma} t^\lambda} \, dt.

The integral is then simply estimated by expanding \exp(-\frac{h_\lambda x}{\sigma} t^\lambda) and integrating
termwise

(106)   [z^n] H^k(z) \sim \frac{\sigma^k \rho^{-n}}{n} \sum_{j \ge 1} \frac{(-x)^j}{j!} \left( \frac{h_\lambda}{\sigma} \right)^j \frac{1}{\Gamma(-\lambda j)},

which is equivalent to Equation (103), by virtue of the complement formula for the
Gamma function.

Case (ii). When 1 < λ < 2, the contour of integration in the z-plane is chosen
to be a positively oriented loop, made of two rays of angle π/(2λ) and −π/(2λ) that
intersect on the real axis at a distance 1/n^{1/λ} left of the singularity. The coefficient
integral of H^k is rescaled by setting z = ρ(1 − t/n^{1/λ}), and one has

        [z^n] H^k(z) \sim -\frac{\sigma^k \rho^{-n}}{2i\pi n^{1/\lambda}} \int e^{\frac{h_\lambda}{h_1} t^\lambda} e^{-\frac{x h_1}{\sigma} t} \, dt.

There, the contour of integration in the t-plane comprises two rays of angle π/λ and
−π/λ, intersecting at −1. Setting u = t^λ h_λ/h_1, the contour transforms into a classical
Hankel contour, starting from −∞ over the real axis, winding about the origin,
and returning to −∞. So, with α = 1/λ, one has

        [z^n] H^k(z) \sim -\alpha \frac{\sigma^k \rho^{-n}}{2i\pi n^{\alpha}} \left( \frac{h_1}{h_\lambda} \right)^{\alpha} \int e^{u} \exp\left( -\frac{x h_1^{1 + \alpha}}{\sigma h_\lambda^{\alpha}} u^{\alpha} \right) u^{\alpha - 1} \, du.

Expanding the exponential, integrating termwise, and appealing to the complement
formula for the Gamma function finally reduces this last form to (104).

Case (iii). This case is only included here for comparison purposes, but, as
recalled before the proof, it is essentially implied by the developments of Chapter VIII
based on the saddle-point method. When 2 < λ < 3, the angle φ of the contour of
integration in the z-plane is chosen to be π/2, and the scaling is √n: under the change
of variable z = ρ(1 − t/√n), the contour is transformed into two rays of angle π/2
and −π/2 (i.e., a vertical line), intersecting at −1, and


k—n 
oP Py 
["]H*@) ~ -= fe * ‘dt, 
2in/n 




with p = p = a Complementing the square, and letting u = t — we get 


kyon ht oo 

oOo -p —z- 5x u2 

[2"]H*(z) ~~ — e 4po Je du, 
Qin J/n 


which gives Equation (105). By similar means, such a Gaussian approximation can
be shown to hold for any non-integral singular exponent λ > 2.


▷ IX.53. Zipf distributions. Zipf's law, named after the Harvard linguistic professor George
Kingsley Zipf (1902-1950), is the observation that, in a language like English, the frequency
with which a word occurs is roughly inversely proportional to its rank: the kth most frequent
word has frequency proportional to 1/k. The generalized Zipf distribution of parameter α > 1
is the distribution of a random variable Z such that

        P(Z = k) = \frac{1}{\zeta(\alpha)} \frac{1}{k^{\alpha}}.

It has infinite mean for α ≤ 2 and infinite variance for α ≤ 3. It was proved in Chapter VI
(p. 408) that polylogarithms are amenable to singularity analysis. Consequently, the sum of a
large number of independent Zipf variables satisfies a local limit law of the stable type with
index α − 1 (for α ≠ 2). ◁


Example IX.41. Mean level profiles in simple varieties of trees. Consider the RV equal to
the depth of a random node in a random tree taken from a simple variety 𝒴 that satisfies the
smooth inverse-function schema (Definition VII.3, p. 453). The problem of quantifying the
corresponding distribution is equivalent to that of determining the mean level profile, that is
the sequence of numbers M_{n,k} representing the mean number of nodes at distance k from the
root. (Indeed, the probability that a random node lies at level k is M_{n,k}/n.) The first few levels
have been characterized in Example VII.7 (p. 458) and the analysis of Chapter VII can now be
completed thanks to Theorem IX.16. (The problem was solved by Meir and Moon [435] in an
important article that launched the analytic study of simple varieties of trees. Meir and Moon
base their analysis on a Lagrangean change of variable and on the saddle-point method, along
the lines of our remarks in Chapter VIII, p. 590.) As usual, we let φ(w) be the generator of the
simple variety 𝒴, with Y(z) satisfying Y = zφ(Y), and we designate by τ the positive root of
the characteristic equation:

        \tau \phi'(\tau) - \phi(\tau) = 0.

It is known from Theorem VII.3 (p. 468) that the GF Y(z) has a square root singularity at ρ =
τ/φ(τ). For convenience, we also assume aperiodicity of φ. Meir and Moon's major result
(Theorem 4.3 of [435]) is as follows.

Proposition IX.23 (Mean level profiles). The mean profile of a large tree in a simple variety
obeys a Rayleigh law in the asymptotic limit: for k/√n in any bounded interval of R_{≥0}, the
mean number of nodes at altitude k satisfies asymptotically

        M_{n,k} \sim A k e^{-A k^2/(2n)},

where A = τφ''(τ).

The proof goes as follows. For each k, define Y_k(z, u) to be the BGF with u marking the
number of nodes at depth k. Then, the root decomposition of trees translates into the recurrence:

        Y_k(z, u) = z\phi(Y_{k-1}(z, u)),    Y_0(z, u) = zu\phi(Y(z)) = uY(z).

By construction, we have

        M_{n,k} = \frac{1}{Y_n} [z^n] \left( \frac{\partial}{\partial u} Y_k(z, u) \right)_{u=1}.

On the other hand, the fundamental recurrence yields

        \left( \frac{\partial}{\partial u} Y_k(z, u) \right)_{u=1} = (z\phi'(Y(z)))^k Y(z).

Now, φ'(Y) has, like Y, a square-root singularity. The semi-large powers theorem applies
with λ = 1/2 and the result follows.
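The identity M_{n,k} = [z^n] (zφ'(Y))^k Y / Y_n can be evaluated exactly for small n. The sketch below does so in one special case chosen for illustration, general Catalan trees with φ(w) = 1/(1 − w), where Y_n is the Catalan number of index n − 1 and zφ'(Y) simplifies to Y²/z because 1/(1 − Y) = Y/z. The function names are hypothetical. Summing over all levels must return n, the total number of nodes, which gives an exact check.

```python
from fractions import Fraction
from math import comb

def catalan_tree_gf(N):
    """Coefficients Y_0..Y_N of Y(z) = z/(1 - Y(z)); Y_n is the Catalan number C_{n-1}."""
    return [0] + [comb(2 * n - 2, n - 1) // n for n in range(1, N + 1)]

def mul(a, b, N):
    """Product of two integer power series, truncated after z^N."""
    out = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(N + 1 - i):
                out[i + j] += ai * b[j]
    return out

def mean_profile(n):
    """[M_{n,0}, ..., M_{n,n-1}] for general Catalan trees of size n."""
    Y = catalan_tree_gf(n + 1)
    g = mul(Y, Y, n + 1)[1:]          # z*phi'(Y) = Y^2 / z when phi(w) = 1/(1 - w)
    term = Y[: n + 1]                 # (z*phi'(Y))^0 * Y
    profile = []
    for _ in range(n):
        profile.append(Fraction(term[n], Y[n]))
        term = mul(term, g, n)        # multiply by one more factor z*phi'(Y)
    return profile
```

For n = 3 there are two trees (a path and a root with two children), and the profile is 1, 3/2, 1/2, which the code reproduces.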


▷ IX.54. The width of trees. The expectation of the width W of a tree in a simple variety
satisfies

        C_1 \sqrt{n} \le E_{\mathcal{Y}_n}(W) \le C_2 \sqrt{n \log n},

for some C_1, C_2 > 0. (This is due to Odlyzko and Wilf [463], a possible approach consisting in
suitably bounding the level profile of random trees. Better bounds are known, now that W_n/√n
has been recognized to be related to Brownian excursion. In particular, the expected width is
∼ c√n; see Example V.17, p. 359 and the references there.) ◁


Critical compositions. Theorem IX.16 provides useful information on compositions
of the form

        F(z, u) = G(uH(z)),

provided G(z) and H(z) are of singularity analysis class. As we know, combinatorially,
this represents a substitution between structures, F = G ∘ H, and the coefficient
[z^n u^k] F(z, u) counts the number of F-structures of size n whose G-component, also
called core in what follows, has size k. Then the probability distribution of core-size
X_n in F-structures of size n is given by

        P(X_n = k) = \frac{([z^k] G(z)) \, ([z^n] H(z)^k)}{[z^n] G(H(z))}.

The case where the schema is critical, in the sense that H(r_H) = r_G with r_H, r_G
the radii of convergence of H, G, follows as a direct consequence of Theorem IX.16.
What comes out is the following informally stated general principle (details would
closely mimic the statement of Theorem IX.16 and are omitted: see [28]).


Proposition IX.24 (Critical compositions). In a composition schema F(z, u) =
G(uH(z)) where H and G have singular exponents λ, λ' with λ' ≤ λ:

(i) for 0 < λ < 1, the normalized core-size X_n/n^λ is spread over (0, +∞) and it
satisfies a local limit law whose density involves a stable law of index λ; in particular,
λ = 1/2 corresponds to a Rayleigh law;

(ii) for 1 < λ < 2, the distribution of X_n is bimodal and the “large region”
X_n = cn + xn^{1/λ} involves a stable law of index λ;

(iii) for 2 < λ, the standardized version of X_n admits a local limit law that is of
Gaussian type.


Similar phenomena occur when λ' > λ, but with a greater preponderance of
the “small” region. Many instances have already appeared scattered in the literature,
especially in connection with rooted trees. For instance, this proposition explains well
the occurrence of the Rayleigh law (λ = 1/2) as the distribution of cyclic points
in random mappings and of zero-contacts in random bridges. The case λ = 3/2
appears in forests of unrooted trees (see the discussion in Chapter VIII, p. 603, for an
alternative approach based on coalescing saddle-points) and it is ubiquitous in planar
maps, as attested by the article of Banderier et al. on which this subsection is largely
based [28]. We detail one of the cases in the following example, which explains the
meaning of the term “large region” in Proposition IX.24.


Example IX.42. Biconnected cores of planar maps. The OGF of rooted planar maps, with size
determined by the number of edges, is, by Subsection VII.8.2 (p. 513),


aise din: iniigs nae 8A 18 
(107) M(z) = aa () 182— 1 — 122) iF 


with a characteristic 3/2 exponent. Define a separating vertex or articulation point in a map 
to be a vertex whose removal disconnects the graph. Let C denote the class of non-separable 
maps, that is, maps without an articulation point (also known as biconnected maps). Starting 
from the root edge, any map decomposes into a non-separable map, called the “core” on which 
are grafted arbitrary maps, as illustrated by the following diagram: 


There results the equation: 


(108)   M(z) = C(H(z)),    H(z) = z(1 + M(z))^2.

Since we know M, hence H, this last relation gives by inversion the OGF of non-separable
maps as an algebraic function of degree 3 specified implicitly by the equation

        C^3 + 2C^2 + (1 - 18z)C + 27z^2 - 2z = 0,

with expansion at the origin (EIS A000139):

        C(z) = 2z + z^2 + 2z^3 + 6z^4 + 22z^5 + 91z^6 + \cdots,    C_{k+1} = 2 \frac{(3k)!}{(k + 1)! \, (2k + 1)!}.

(The closed form results from a Lagrangean parameterization.) Now the singularity of C is also
of the Z^{3/2} type as seen by inversion of (108) or from the Newton diagram attached to the cubic
equation. We find in particular

        C(z) = \frac{1}{3} - \frac{4}{9}(1 - 27z/4) + \frac{8\sqrt{3}}{81}(1 - 27z/4)^{3/2} + O((1 - 27z/4)^2),

which is reflected by the asymptotic estimate

        C_k \sim \frac{2}{27} \sqrt{\frac{3}{\pi}} \left( \frac{27}{4} \right)^k k^{-5/2}.
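The cubic equation and the closed form for the coefficients can be confronted mechanically. The sketch below builds the series from C_{k+1} = 2(3k)!/((k+1)!(2k+1)!), substitutes it into C³ + 2C² + (1 − 18z)C + 27z² − 2z, and returns the coefficients of the result, which should all vanish up to the truncation order. It also evaluates partial sums of C(z) at z = 4/27, which should approach the constant term 1/3 of the singular expansion, slowly, since the terms decay like k^{−5/2}. Function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def nonseparable_coeffs(N):
    """[C_0, ..., C_N] with C_0 = 0 and C_{k+1} = 2 (3k)! / ((k+1)! (2k+1)!)."""
    return [0] + [2 * factorial(3 * k) // (factorial(k + 1) * factorial(2 * k + 1))
                  for k in range(N)]

def mul(a, b, N):
    """Product of two integer power series, truncated after z^N."""
    out = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j in range(N + 1 - i):
                out[i + j] += ai * b[j]
    return out

def cubic_residual(N):
    """Coefficients up to z^N (N >= 2) of C^3 + 2C^2 + (1 - 18z)C + 27z^2 - 2z."""
    C = nonseparable_coeffs(N)
    C2 = mul(C, C, N)
    C3 = mul(C2, C, N)
    res = [C3[i] + 2 * C2[i] + C[i] for i in range(N + 1)]
    for i in range(1, N + 1):
        res[i] -= 18 * C[i - 1]       # the -18 z C term
    res[1] -= 2                       # the -2z term
    res[2] += 27                      # the 27 z^2 term
    return res

def partial_value(N):
    """Partial sum of C(z) at z = 4/27, using terms up to z^N."""
    C = nonseparable_coeffs(N)
    return float(sum(Fraction(c) * Fraction(4, 27) ** k for k, c in enumerate(C)))
```

Exact integer arithmetic is used for the residual, so a non-zero entry would point to a genuine mismatch rather than to rounding.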


The parameter considered here is the size X_n of the core (containing
the root) in a random map of size n. The composition relation is M = C ∘ H, where H =
Z(1 + M)^2. The BGF is thus M(z, u) = C(uH(z)) where the composition C ∘ H is of the
singular type Z^{3/2} ∘ Z^{3/2}. What is peculiar here is the “bimodal” character of the distribution
of core-size (see Figure IX.21 borrowed from [28]), which we now detail.
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Figure IX.21. Left: The standard “Airy map distribution”. Right: Observed frequen- 
cies of core-sizes k € [20; 1000] in 50000 random maps of size 2000, showing the 
bimodal character of the distribution. 


First straight singularity analysis shows that, for fixed k, 


[2"]H(z)* 
k My n—->oo 


P(X, =H =C kCyhi |, 

where hg = 4/27 is the value of H(z) at its singularity. In other words, there is local con- 
vergence of the probabilities to a fixed discrete law. The estimate above can be proved to 
remain uniform as long as k tends to infinity sufficiently slowly. We shall call this the “small 
range” of k values. Now, summing the probabilities associated to this small range gives the 
value C(hg) = 1/3. Thus, one-third of the probability mass of core-size arises from the small 
range, where a discrete limit law is observed. 

The other part of the distribution constitutes the “large range” to which Theorem IX.16 
applies. It contains asymptotically 2/3 of the probability mass of the distribution of X,,. In that 
case, the limit law is related to a stable distribution with density S(x; 3/2) and is also known as 
the “Airy map” distribution: one finds for k = qn + xn?/3, the local limit approximation: 


(109) P(X,»=)~— A (G25) 5 A(x) = 207273 (« Ai(x2) — Ai!(x)) 
3n2i3 NA 
There Ai(x) is the Airy function (defined in the footnote on p. 534) and A(x) specifies the Airy 
map distribution displayed in Figure IX.21. 
The bimodal character of the distribution of core-sizes can now be better understood [28]. 
A random map decomposes into biconnected components and the largest biconnected compo- 
nent has, with high probability, a size that is O(n). There are also a large number (O(n)) of 
“dangling” biconnected components. In a rooted map, the root is in a sense placed “at random”. 
Then, with a fixed probability, it either lies in the large component (in which case, the distri- 
bution of that large component is observed, this is the continuous part of the distribution given 
by the Airy map law), or else one of the small components is picked up by the root (this is the
discrete part of the distribution).


▷ IX.55. Critical cycles. The theory adapts to logarithmic factors. For instance the critical
composition F(z, u) = −log(1 − ug(z)) leads to developments similar to those of the critical
sequence. In this way, it becomes possible for instance to analyse the number of cyclic points
in a random connected mapping. ◁


▷ IX.56. The base of supertrees. Supertrees defined in Chapter VI (p. 412) are trees grafted on
trees. Consider the bicoloured variant K = G(2ZG), with G the class of general Catalan trees.
Then, the law of the external G-component is related to a stable law. ◁


IX.12. Multivariate limit laws


Combinatorics can take advantage of the enumeration of objects with respect to a
whole collection of parameters. The symbolic methods of Part A are well suited and
we have seen in Chapter III ways to solve problems like: how many permutations are
there of size n with n_1 singleton cycles and n_2 cycles of length 2? In combinatorial
terms we are seeking information about a multivariate (rather than plainly bivariate)
sequence, say F_{n,k_1,k_2}. In probabilistic terms, we aim at characterizing the joint
distribution, say (X_n^{(1)}, X_n^{(2)}), of a family of random variables. Methods developed in
this chapter adapt well to multivariate situations. Typically, there exist natural extensions
of continuity theorems, both for PGFs and for integral transforms and the most
abstract aspects of the foregoing discussion regarding central and local limit laws as
well as tail estimates and large deviations can be recycled.

Consider for instance the joint distribution of the numbers χ_1, χ_2 of singletons
and doubletons in random permutations. Then, the parameter χ = (χ_1, χ_2) has a
trivariate EGF

        F(z, u_1, u_2) = \frac{\exp((u_1 - 1)z + (u_2 - 1)z^2/2)}{1 - z}.

Thus, the bivariate PGF satisfies, by meromorphic analysis,

        p_n(u_1, u_2) = [z^n] F(z, u_1, u_2) \sim e^{(u_1 - 1)} e^{(u_2 - 1)/2},

uniformly when the pair (u_1, u_2) ranges over a compact set of C × C. As a result, the
joint distribution of (χ_1, χ_2) is a product of a Poisson(1) and a Poisson(1/2) distribution;
in particular χ_1 and χ_2 are asymptotically independent.
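The Poisson(1) × Poisson(1/2) limit is visible already for small permutations. The sketch below tabulates, by exhaustive enumeration, the joint law of the number of fixed points and the number of 2-cycles over all permutations of size n; it is a brute-force illustration, practical only for small n, and the function names are hypothetical. The means are exactly 1 and 1/2 as soon as n ≥ 2, and the probability of having neither a fixed point nor a 2-cycle is close to e^{−3/2}.

```python
from fractions import Fraction
from itertools import permutations
from math import exp, factorial

def joint_law(n):
    """Exact joint law of (#fixed points, #2-cycles) in a uniform permutation of size n."""
    counts = {}
    for p in permutations(range(n)):
        fixed = sum(1 for i in range(n) if p[i] == i)
        # each 2-cycle {i, j} is seen twice, once from i and once from j
        twos = sum(1 for i in range(n) if p[i] != i and p[p[i]] == i) // 2
        counts[(fixed, twos)] = counts.get((fixed, twos), 0) + 1
    total = factorial(n)
    return {key: Fraction(c, total) for key, c in counts.items()}

def means(law):
    """Expected numbers of fixed points and of 2-cycles under the given joint law."""
    m1 = sum(prob * key[0] for key, prob in law.items())
    m2 = sum(prob * key[1] for key, prob in law.items())
    return m1, m2

def poisson_product(j, k):
    """Limit probability: Poisson(1) mass at j times Poisson(1/2) mass at k."""
    return exp(-1.5) * 0.5 ** k / (factorial(j) * factorial(k))
```

With n = 7 (5040 permutations), `joint_law(7)[(0, 0)]` can be compared with `poisson_product(0, 0)`.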

Consider next the joint distribution of χ = (χ_1, χ_2), where χ_j is the number
of summands equal to j in a random integer composition. Each parameter individually
obeys a limit Gaussian law, since the sequence construction is supercritical. The
trivariate GF is

        F(z, u_1, u_2) = \frac{1}{1 - z(1 - z)^{-1} - (u_1 - 1)z - (u_2 - 1)z^2}.

By meromorphic analysis, a higher dimensional quasi-power approximation may be
derived:

        [z^n] F(z, u_1, u_2) \sim c(u_1, u_2) \rho(u_1, u_2)^{-n},

for some third-degree algebraic function ρ(u_1, u_2). In such cases, multivariate versions
of the continuity theorem for integral transforms can be applied. (See the book
by Gnedenko and Kolmogorov [294] and especially the treatment of Bender and Richmond
in [44].) As a result, the joint distribution is, in the asymptotic limit, a bivariate
Gaussian distribution with a covariance matrix that is computable from ρ(u_1, u_2).
Such generalizations are typical and involve essentially no radically new concept, just
natural technical adaptations.


A highly interesting approach to multivariate problems is that of functional limit 
theorems. The goal is now to characterize the joint distribution of an unbounded 
collection of parameters. The limit process is then a stochastic process, essentially an 
object that lives in some infinite-dimensional space. For instance, the joint distribution
of all altitudes in random walks is accounted for by Brownian motion. The joint distribution
of all cycle lengths in random permutations is described explicitly by Cauchy’s
formula (p. 188) and DeLaurentis and Pittel [149] have shown a convergence to the 
standard Brownian motion process, after a suitable renormalization. A rather spectac- 
ular application of this circle of ideas was provided in 1977 by Logan, Shepp, Vershik 
and Kerov [411, 596]. These authors established that the shape of the pair of Young 
tableaux associated to a random permutation conforms, in the asymptotic limit and 
with high probability, to a deterministic trajectory defined as the solution to a varia- 
tional problem. In particular, the width of a Young tableau associated to a permutation 
gives the length of the longest increasing sequence of the permutation. By special- 
izing their results, the authors were then able to show that the expected length in a 
random permutation of size $n$ is asymptotic to $2\sqrt{n}$, a long-standing conjecture at the
time (see also our remarks on p. 597 for subsequent developments). There is currently 
a flurry of activity on these questions, with methods ranging from purely probabilistic 
to purely analytic. 

Among extensions of the standard approach presented in this book to analytic 
combinatorics, we single out a few, which seem especially exciting. Lalley [397] has 
extended the framework of the important Drmota–Lalley–Woods Theorem (p. 489)
to certain infinite systems of equations, by appealing to Banach space theory—this 
has applications in the theory of random walks on groups. Vallée and coauthors (see 
Note IX.32, p. 664, and the survey [584]) have developed a broad theory based on 
transfer operators from dynamical systems theory, where generating operators replace 
generating functions and operate on certain infinite dimensional functional spaces— 
there are surprising applications both in information theory and in analytic number 
theory (e.g., the analysis of Euclidean algorithms). McKay [432] has shown how 
to extend the one-dimensional saddle-point theory presented in Chapter VIII in a 
highly non-trivial way in order to treat certain counting problems where a problem 
of size n is represented by a d(n)-dimensional integral, with d(n) tending to infinity 
with n—this is especially important since a great many hard combinatorial problems 
can be represented in this manner, including for instance the celebrated random SAT- 
problem [77, 486]. 

We hope that the fairly complete treatment of standard aspects of the theory of- 
fered in this book will help our reader to master and enrich a field, which is extremely 
vast, blooming, and pregnant with fascinating problems at the crossroads of discrete 
and continuous mathematics. 


IX.13. Perspective


The study of parameters of combinatorial structures ideally culminates in an un- 
derstanding of the distribution of the parameter’s values, typically under the assump- 
tion that each instance of a given size in a combinatorial class appears with equal 
likelihood. 

First, as we have already seen in Chapter III, we can extend the basic combinatorial
constructions of Chapters I and II to include bivariate generating functions
(BGFs) whose second variable carries information about the parameter. Our combinatorial
constructions then provide a systematic way to develop succinct BGFs for a
broad range of combinatorial classes and parameters, which are of interest in combi- 
natorics, computer science, and other applied sciences. 

Next, the various methods considered in Chapters IV–VIII (Part B) of this book
can be extended to develop asymptotic results for BGFs by studying slight perturba- 
tions of the singularities, controlled by the second variable. The uniform precision of 
the asymptotic results that we develop in Part B is a critical component in our ability 
to do this, by contrast with other classical methods for coefficient asymptotics (Dar- 
boux’s method and Tauberian theorems) which are, to a large extent, non-constructive. 

These asymptotic results take the form of limit laws: the distributions governing
the behaviour of parameters converge to a fixed discrete distribution or, appropriately
scaled, to a continuous distribution. Whereas BGFs are purely formal objects, determining
whether the distribution is discrete or continuous requires analysis of them as
functions of complex variables. In a preponderance of cases, the limit laws say that 
parameter values approach a single distribution, the well-known Gaussian (normal) 
distribution. The well-known central limit theorem is but one example (not the ex- 
planation) of this phenomenon, whose breadth is truly remarkable. For example, we 
have encountered numerous examples where the occurrence of a given fixed pattern in 
a large random object is almost certain, with the number of occurrences governed by 
Gaussian fluctuations. This property holds true for strings, uniform tree models, and 
increasing trees. The associated BGFs are rational functions, algebraic functions, and 
solutions to nonlinear differential equations, respectively, but the approach of extend- 
ing the methods of Part B to study local perturbations of singularities is effective in 
each case—the proofs eventually reduce to establishing an extremely simple property, 
a singularity that smoothly moves. 

Such studies are an appropriate conclusion to this book, because they illustrate the 
power of analytic combinatorics. We are able to use formal methods to develop suc- 
cinct formal objects that encapsulate the combinatorial structure (BGFs), then, treat- 
ing those BGFs as objects of analysis (functions of one, then two complex variables) 
we are able to obtain wide encompassing asymptotic information about the original 
combinatorial structure. Such an approach has serendipitous consequences. Combi- 
natorial problems can then be organized into broad schemas, covering infinitely many 
combinatorial types and governed by simple asymptotic laws—the discovery of such 
schemas and of the associated universality properties constitutes the very essence of 
analytic combinatorics. 


Bibliographic notes. This chapter is primarily inspired by the studies of Bender and Rich- 
mond [35, 44, 46], Canfield [101], Flajolet, Soria, and Drmota [171, 172, 175, 176, 258, 260, 
547] as well as Hwang [337, 338, 339, 340]. Bender’s seminal study [35] initiated the study 
of bivariate analytic schemes that lead to Gaussian laws and the paper [35] may rightly be con- 
sidered to be at the origin of the field. Canfield [101], building upon earlier studies showed the 
approach to extend to saddle-point schemas. 

Tangible progress was next made possible by the development of the singularity analysis 
method [248]. Earlier research was mostly restricted to methods based on subtraction of
singularities, as in [35], which is in particular effective for meromorphic cases. The extension to
algebraic–logarithmic singularities was, however, difficult given that the classical method of
Darboux does not provide for uniform error terms. In contrast, singularity analysis does ap- 
ply to classes of analytic functions, since it allows for uniformity of estimates. The papers by 
Flajolet and Soria [258, 260] were the first to make clear the impact of singularity analysis on 
bivariate asymptotics. Gao and Richmond [277] were then able to extend the theory to cases 
where both a singularity and its singular exponent are allowed to vary. 

From there, Soria developed the framework of schemas considerably in her doctor- 
ate [547]. Hwang extracted the very important concept of “quasi-powers” in his thesis [337] 
together with a wealth of properties such as full asymptotic expansions, speed of convergence, 
and large deviations. Drmota established general existence conditions leading to Gaussian laws 
in the case of implicit, especially algebraic, functions [171, 172]. The “singularity perturbation” 
framework for solutions of linear differential equations first appears under that name in [243]. 
Finally, the books by Sachkov, see [525] and especially [526] (based on the 1978 edition [524]) 
offer a modern perspective on bivariate asymptotics applied to classical combinatorial struc- 
tures. 


ויתר מהמה בני הזהר עשות ספרים הרבה אין קץ
ולהג הרבה יגעת בשר
(“But beyond this, my son, be warned: the writing of many books
is endless; and excessive devotion to books is wearying to the body.”)


— Tanakh (The Bible), Qohelet (Ecclesiastes) 12:12. 


Part D 


APPENDICES 


APPENDIX A 


Auxiliary Elementary Notions 


We combine in the three appendices definitions and theorems related to key mathematical con- 
cepts not covered directly in the text. Generally, the entries in the appendices are independent, 
intended for reference while addressing the main text. Our own Introduction to the Analy- 
sis of Algorithms [538] is a gentle introduction to many of the concepts underlying analytic 
combinatorics at a level accessible to any college student and is reasonable preparation for un- 
dergraduates or anyone undertaking to read this book for self-study. 


This appendix contains entries that are arranged in alphabetical order, regarding the fol- 
lowing topics: 
Arithmetical functions; Asymptotic notations; Combinatorial probability; Cycle 
construction; Formal power series; Lagrange inversion; Regular languages; Stir- 
ling numbers; Tree concepts. 


The corresponding notions and results are used throughout the book, and especially in Part A 
relative to Symbolic Methods. Accessible introductions to the subject of this appendix are the 
books by Graham–Knuth–Patashnik [307], and Wilf [608], regarding combinatorial enumeration,
and De Bruijn’s vivid booklet [142], regarding asymptotic analysis. Reference works
in combinatorial analysis are the books by Comtet [129], Goulden–Jackson [303], and
Stanley [552, 554].


A.1. Arithmetical functions 


A general reference for this section is Apostol’s book [16]. First, the Euler totient
function $\varphi(k)$ intervenes in the unlabelled cycle construction (pp. 27, 84, 165, as well
as 729 below). It is defined as the number of integers in $[1\,..\,k]$ that are relatively
prime to $k$. Thus, one has $\varphi(p) = p - 1$ if $p \in \{2, 3, 5, \ldots\}$ is a prime. More
generally when the prime number decomposition of $k$ is $k = p_1^{\alpha_1} \cdots p_r^{\alpha_r}$, then

$$\varphi(k) = p_1^{\alpha_1 - 1}(p_1 - 1) \cdots p_r^{\alpha_r - 1}(p_r - 1).$$

A number is squarefree if it is not divisible by the square of a prime. The Möbius
function $\mu(n)$ is defined to be 0 if $n$ is not squarefree and otherwise is $(-1)^r$ if
$n = p_1 \cdots p_r$ is a product of $r$ distinct primes.

Many elementary properties of arithmetical functions are easily established by
means of Dirichlet generating functions (DGFs). Let $(a_n)_{n \ge 1}$ be a sequence; its DGF
is formally defined by

$$\alpha(s) = \sum_{n \ge 1} \frac{a_n}{n^s}.$$

In particular, the DGF of the sequence $a_n = 1$ is the Riemann zeta function,
$\zeta(s) = \sum_{n \ge 1} n^{-s}$. The fact that every number uniquely decomposes into primes is reflected
by Euler’s formula,

(1)    $$\zeta(s) = \prod_{p \in \mathcal{P}} \left(1 - \frac{1}{p^s}\right)^{-1},$$

where $p$ ranges over the set $\mathcal{P}$ of all primes. (As observed by Euler, the fact that
$\zeta(1) = \infty$ in conjunction with (1) provides a simple analytic proof that there are
infinitely many primes! See Note IV.1, p. 228.)

Equation (1) implies that the DGF of the Möbius function satisfies

(2)    $$M(s) := \sum_{n \ge 1} \frac{\mu(n)}{n^s} = \prod_{p \in \mathcal{P}} \left(1 - \frac{1}{p^s}\right) = \frac{1}{\zeta(s)}.$$

(Verification: expand the infinite product and collect the coefficient of $1/n^s$.)
Finally, if $(a_n), (b_n), (c_n)$ have DGF $\alpha(s), \beta(s), \gamma(s)$, then one has the equivalence

$$\alpha(s) = \beta(s)\gamma(s) \iff a_n = \sum_{d \mid n} b_d\, c_{n/d}.$$

In particular, taking $c_n = 1$ ($\gamma(s) = \zeta(s)$) and solving for $\beta(s)$ shows (using (2)) the
implication

(3)    $$a_n = \sum_{d \mid n} b_d \implies b_n = \sum_{d \mid n} \mu(d)\, a_{n/d},$$

which is known as Möbius inversion. This relation is used in the enumeration of
irreducible polynomials (Section I.6.3, p. 88).
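
The definitions of this section translate directly into a few lines of code. The sketch below is only an illustration under simple assumptions (trial division, small arguments); the helper names are chosen here and do not come from the text. It computes the Euler totient and the Möbius function from the prime decomposition and checks Möbius inversion on the classical identity stating that the totients of the divisors of n sum to n.

```python
def prime_factors(k):
    """Prime factorization of k >= 1 as a dict {prime: exponent} (trial division)."""
    factors, p = {}, 2
    while p * p <= k:
        while k % p == 0:
            factors[p] = factors.get(p, 0) + 1
            k //= p
        p += 1
    if k > 1:
        factors[k] = factors.get(k, 0) + 1
    return factors

def totient(k):
    """Euler totient via the product formula over the prime decomposition."""
    result = 1
    for p, a in prime_factors(k).items():
        result *= p ** (a - 1) * (p - 1)
    return result

def mobius(n):
    """Moebius function: 0 if n is not squarefree, else (-1)^(number of primes)."""
    f = prime_factors(n)
    if any(a > 1 for a in f.values()):
        return 0
    return (-1) ** len(f)

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius_invert(a, n):
    """Given a(n) = sum_{d|n} b(d), recover b(n) = sum_{d|n} mu(d) a(n/d)."""
    return sum(mobius(d) * a(n // d) for d in divisors(n))

# Since sum_{d|n} totient(d) = n, inverting a(n) = n must give back the totient.
for n in (1, 6, 12, 30):
    print(n, totient(n), mobius_invert(lambda m: m, n))
```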


A.2. Asymptotic notations 


Let $S$ be a set and $s_0 \in S$ a particular element of $S$. We assume a notion of
neighbourhood to exist on $S$. Examples are $S = \mathbb{Z}_{>0} \cup \{+\infty\}$ with $s_0 = +\infty$; $S = \mathbb{R}$
with $s_0$ any point in $\mathbb{R}$; $S = \mathbb{C}$ or a subset of $\mathbb{C}$ with $s_0 = 0$, and so on. Two functions
$\phi$ and $g$ from $S \setminus \{s_0\}$ to $\mathbb{R}$ or $\mathbb{C}$ are given.


– O-notation: write

$$\phi(s) \underset{s \to s_0}{=} O(g(s))$$

if the ratio $\phi(s)/g(s)$ stays bounded as $s \to s_0$ in $S$. In other words, there
exists a neighbourhood $V$ of $s_0$ and a constant $C > 0$ such that

$$|\phi(s)| \le C\,|g(s)|, \qquad s \in V, \quad s \ne s_0.$$

One also says that “$\phi$ is of order at most $g$”, or “$\phi$ is big-Oh of $g$” (as $s$
tends to $s_0$).
– ∼-notation: write

$$\phi(s) \underset{s \to s_0}{\sim} g(s)$$

if the ratio $\phi(s)/g(s)$ tends to 1 as $s \to s_0$ in $S$. One also says that “$\phi$ and
$g$ are asymptotically equivalent” (as $s$ tends to $s_0$).
– o-notation: write

$$\phi(s) \underset{s \to s_0}{=} o(g(s))$$

if the ratio $\phi(s)/g(s)$ tends to 0 as $s \to s_0$ in $S$. In other words, for any
(arbitrarily small) $\varepsilon > 0$, there exists a neighbourhood $V_\varepsilon$ of $s_0$ (depending
on $\varepsilon$), such that

$$|\phi(s)| \le \varepsilon\,|g(s)|, \qquad s \in V_\varepsilon, \quad s \ne s_0.$$

One also says that “$\phi$ is of order smaller than $g$”, or “$\phi$ is little-oh of $g$” (as $s$
tends to $s_0$).


These notations are due to Bachmann and Landau towards the end of the nineteenth 
century. See Knuth’s note for a historical discussion [381, Ch. 4]. 
Related notations, of which, however, we only make a scant use, are 


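– Ω-notation: write

$$\phi(s) \underset{s \to s_0}{=} \Omega(g(s))$$

if the ratio $\phi(s)/g(s)$ stays bounded from below in modulus by a non-zero
quantity, as $s \to s_0$ in $S$. One then says that $\phi$ is of order at least $g$.
– Θ-notation: if $\phi(s) = O(g(s))$ and $\phi(s) = \Omega(g(s))$, write

$$\phi(s) \underset{s \to s_0}{=} \Theta(g(s)).$$

One then says that $\phi$ is of order exactly $g$.
For instance, one has as $n \to +\infty$ in $\mathbb{Z}_{>0}$:

$$\sin n = o(\log n); \quad \log n = O(\sqrt{n}); \quad \log n = o(\sqrt{n}); \quad \binom{n}{2} = \Omega(n\sqrt{n}); \quad \pi n + \sqrt{n} = \Theta(n).$$

As $x \to 1$ in $\mathbb{R}_{\le 1}$, one has

$$\sqrt{1 - x} = o(1); \quad e^x = O(\sin x); \quad \log x = \Theta(x - 1).$$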


We take as granted in this book the elementary asymptotic calculus with such
notations (see, e.g., [538, Ch. 4] for a smooth introduction close to the needs of
analytic combinatorics and de Bruijn’s classic [143] for a beautiful presentation). We
shall retain here in particular the fact that Taylor expansions (Note A.6, p. 726) imply
asymptotic expansions; for instance, the convergent expansions, all valid for $|u| < 1$,

$$\log(1 + u) = \sum_{k=1}^{\infty} \frac{(-1)^{k-1}}{k} u^k, \quad \exp(u) = \sum_{k \ge 0} \frac{u^k}{k!}, \quad (1 - u)^{-\alpha} = \sum_{k \ge 0} \binom{k + \alpha - 1}{k} u^k,$$

imply (as $u \to 0$)

$$\log(1 + u) = u + O(u^2), \quad \exp(u) = 1 + u + \frac{u^2}{2} + O(u^3), \quad (1 - u)^{1/2} = 1 - \frac{u}{2} + O(u^2),$$

and so forth. Consequently, as $n \to +\infty$, one has:

$$\log\left(1 + \frac{1}{n}\right) = \frac{1}{n} + O\left(\frac{1}{n^2}\right), \qquad \left(1 - \frac{1}{\log n}\right)^{1/2} = 1 - \frac{1}{2\log n} + o\left(\frac{1}{\log n}\right).$$

Two important asymptotic expansions are Stirling’s formula for factorials and the
harmonic number approximation, valid for $n \ge 1$,

(4)    $$\begin{aligned} n! &= n^n e^{-n} \sqrt{2\pi n}\,(1 + \epsilon_n), & 0 < \epsilon_n &< \frac{1}{11n}, \\ H_n &= \log n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \eta_n, & \eta_n &= O(n^{-4}), \quad \gamma \doteq 0.57721, \end{aligned}$$


that are commonly established as consequences of the Euler-Maclaurin summation 
formula that relates sums to integrals (see Note A.7, p. 726, references [143, 538], as 
well as Appendix B.7: Mellin transform, p. 762). 
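
The quality of the two approximations in (4) is easy to observe numerically. The following sketch is an illustration only: the value of Euler's constant is hard-coded, the sample values of n are arbitrary, and the helper names are chosen here. It prints the relative error of Stirling's formula, rescaled by n, and the gap between $H_n$ and the four-term harmonic approximation, rescaled by $n^4$.

```python
from fractions import Fraction
from math import e, factorial, log, pi, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded for this sketch

def stirling_relative_error(n):
    """epsilon_n defined by n! = n^n e^{-n} sqrt(2 pi n) (1 + epsilon_n)."""
    main_term = n ** n * e ** (-n) * sqrt(2 * pi * n)
    return factorial(n) / main_term - 1.0

def harmonic(n):
    """H_n computed exactly as a rational, then converted to a float."""
    return float(sum(Fraction(1, k) for k in range(1, n + 1)))

def harmonic_approx(n):
    """log n + gamma + 1/(2n) - 1/(12 n^2)."""
    return log(n) + EULER_GAMMA + 1 / (2 * n) - 1 / (12 * n * n)

for n in (1, 2, 5, 10, 50):
    eps = stirling_relative_error(n)
    eta = harmonic(n) - harmonic_approx(n)
    # n * eps settles near 1/12 and n^4 * eta near 1/120 as n grows
    print(n, n * eps, n ** 4 * eta)
```

The rescaled quantities stabilize quickly, which is consistent with the next terms $1/(12n)$ and $1/(120n^4)$ of the respective expansions.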


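> A.1. Simplification rules for the asymptotic calculus. Some of them are

$$\begin{array}{rcl} O(\lambda f) & \longrightarrow & O(f) \quad (\lambda \ne 0) \\ O(f) + O(g) & \longrightarrow & O(|f| + |g|) \\ & \longrightarrow & O(f) \quad \text{if } g = O(f) \\ O(f \cdot g) & \longrightarrow & O(f)\,O(g). \end{array}$$

Similar rules apply for $o(\cdot)$. <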


Asymptotic scales. An important notion due to Poincaré is that of an asymptotic
scale. A sequence of functions $\omega_0, \omega_1, \ldots$ is said to constitute an asymptotic scale if
all functions $\omega_j$ exist in a common neighbourhood of $s_0 \in S$ and if they satisfy there,
for all $j \ge 0$:

$$\omega_{j+1}(s) = o(\omega_j(s)), \quad \text{i.e.,} \quad \lim_{s \to s_0} \frac{\omega_{j+1}(s)}{\omega_j(s)} = 0.$$

Examples at 0 are the scales: $u_j(x) = x^j$; $v_{2j}(x) = x^j \log x$ and $v_{2j+1}(x) = x^j$;
$w_j(x) = x^{j/2}$. Examples at infinity are $t_j(n) = n^{-j}$, and so on. Given a scale
$\Phi = (\omega_j(s))_{j \ge 0}$, a function $f$ is said to admit an asymptotic expansion in the scale $\Phi$
if there exists a family of complex coefficients $(\lambda_j)$ (the family is then necessarily
unique) such that, for each integer $m$:

(5)    $$f(s) = \sum_{j=0}^{m} \lambda_j \omega_j(s) + O(\omega_{m+1}(s)) \qquad (s \to s_0).$$

In this case, one writes

(6)    $$f(s) \sim \sum_{j=0}^{\infty} \lambda_j \omega_j(s), \qquad (s \to s_0)$$

with an extension of the symbol ‘∼’. (Some authors prefer the notation “≈”, but in
this book, we reserve it to mean informally “approximately equal” or “of the rough
form”.)
The scale may be finite and in most cases, we do not need to specify it as it is
clear from context. For instance, one can write

$$H_n \sim \log n + \gamma + \frac{1}{2n} + \cdots, \qquad \tan x \sim x + \frac{1}{3} x^3 + \frac{2}{15} x^5 + \cdots.$$

In the first case, it is understood that $n \to \infty$ and the scale is $\log n, 1, n^{-1}, n^{-2}, \ldots$.
In the second case, $x \to 0$ and the scale is $x, x^3, x^5, \ldots$. Note carefully that in the
case of a complete expansion (6), convergence of the infinite sum is not in any way
implied: the relation is to be interpreted literally, in the sense of (5); namely, as a
collection of more and more precise descriptions of $f$ when $s$ becomes closer and
closer to $s_0$. (As a matter of fact, almost all the asymptotic expansions of number
sequences developed in this book, starting with Stirling’s formula, are divergent.)


> A.2. Harmonics of harmonics. The harmonic numbers are readily extended to non-integral
index by (cf. also the $\psi$ function, p. 746)

$$H_x := \sum_{k \ge 1} \left[\frac{1}{k} - \frac{1}{k + x}\right].$$

For instance, $H_{1/2} = 2 - 2\log 2$. This extension is related to the Gamma function [604], and it
can be proved that the asymptotic estimate (4), with $x$ replacing $n$, remains valid as $x \to +\infty$.
A typical asymptotic calculation shows that

$$H_{H_n} = \log\log n + \gamma + \frac{\gamma + \frac{1}{2}}{\log n} + O\left(\frac{1}{\log^2 n}\right).$$

What is the shape of an asymptotic expansion of $H_{H_{H_n}}$? <


> A.3. Stackings of dominos. A stock of dominos of length 2cm is given. It is well known that
one can stack up dominos in a harmonic mode:

[Figure: dominos stacked in harmonic mode.]

Estimate within 1% the minimal number of dominos needed to achieve a horizontal span of
1m (=100cm). (Hint: about $1.50926 \times 10^{43}$ dominos!) Set up a scheme to evaluate this integer
exactly, and do it! <


> A.4. High precision fraud. Why is it that, to forty decimal places, one finds

$$\begin{aligned} 4 \sum_{k=1}^{500\,000} \frac{(-1)^{k-1}}{2k - 1} &\doteq 3.141590653589793240462643383269502884197 \\ \pi &\doteq 3.141592653589793238462643383279502884197, \end{aligned}$$

with only four “wrong” digits in the first sum? (Hint: consider the simpler problem

$$\frac{1}{9801} = 0.00\,01\,02\,03\,04\,05\,06\,07\,08\,09\,10\,11\,12\,13\,14\,15\,16\,17\,18\,19\,20\,21\,22\,23\,24\,25 \cdots .)$$

Many fascinating facts of this kind are found in works by Jon and Peter Borwein [79, 80]. <

Uniform asymptotic expansions. The notions previously introduced allow for
uniform versions in the case of families dependent on a secondary parameter [143,
pp. 7-9]. Let $\{f_u(s)\}_{u \in U}$ be a family of functions indexed by $U$. An asymptotic
equivalence like

$$f_u(s) = O(g(s)) \qquad (s \to s_0),$$

is said to be uniform with respect to $u$ if there exists an absolute constant $K$ (independent
of $u \in U$) and a fixed neighbourhood $V$ of $s_0$ such that

$$\forall u \in U, \ \forall s \in V: \quad |f_u(s)| \le K\,|g(s)|.$$

This definition in turn gives rise to the notion of a uniform asymptotic expansion: it
suffices that, for each m, the O error term in (5) be uniform. Such notions are central 
for the determination of limit laws in Chapter IX, where a uniform expansion of a 
class of generating functions near a singularity is usually required. 

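> A.5. Examples of uniform asymptotics. One has uniformly, for $u \in \mathbb{R}$ and $u \in [0, 1]$
respectively:

$$\sin(ux) = O(1), \qquad \left(1 + \frac{1}{n}\right)^u \underset{n \to \infty}{=} 1 + \frac{u}{n} + O\left(\frac{1}{n^2}\right).$$

However, the second expansion no longer holds uniformly with respect to $u$ when $u \in \mathbb{R}$ (take
$u = \pm n$), although it holds pointwise (non-uniformly) for any fixed $u \in \mathbb{R}$. What about the
assertion

$$\left(1 + \frac{1}{n}\right)^u \underset{n \to \infty}{=} 1 + \frac{u}{n} + O\left(\frac{u^2}{n^2}\right) \quad \text{for } u \in \mathbb{R}?$$ <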


> A.6. Taylor expansions. Let $(\phi_k)$ be a sequence of polynomials such that $\phi_0 = 1$ and
$\phi'_{k+1} = \phi_k$, for all $k \ge 0$. A repeated use of integration by parts shows that, for a function $f$
assumed to be sufficiently smooth, one has ($[h]_A^B$ denotes the variation $h(B) - h(A)$)

(7)    $$\int_0^1 f(t)\,dt = [f\phi_1]_0^1 - [f'\phi_2]_0^1 + \cdots + (-1)^{m-1}\,[f^{(m-1)}\phi_m]_0^1 + (-1)^m \int_0^1 f^{(m)}(t)\,\phi_m(t)\,dt.$$

Choosing $\phi_k(t) = (t - 1)^k/k!$ gives the basic Taylor expansion with remainder:

(8)    $$\int_0^1 f(t)\,dt = \sum_{k=0}^{m-1} \frac{f^{(k)}(0)}{(k+1)!} + \frac{1}{m!} \int_0^1 f^{(m)}(t)\,(1 - t)^m\,dt.$$

If $|f^{(m)}(t)|$ is less than $m!\,A^{-m}$ for some $A > 1$, then a convergent representation follows.
Setting $f(t) = x\,g'(xt)$ then yields the classical Taylor expansion with remainder

(9)    $$g(x) = \sum_{k=0}^{m} \frac{g^{(k)}(0)}{k!} x^k + \frac{1}{m!} \int_0^x g^{(m+1)}(t)\,(x - t)^m\,dt,$$

and a convergent infinite series can be deduced under suitable growth assumptions on the derivatives
of $g$. (Complex analytic methods of Chapter IV and Appendix B develop a powerful theory
by which one can avoid explicitly determining and bounding derivatives.) <


> A.7. Euler–Maclaurin summation. Choose now $\phi_k(t) = [z^k]\, z e^{tz}/(e^z - 1)$. The $\phi_k$ are, up
to normalization, Bernoulli polynomials and their coefficients involve the Bernoulli numbers
(p. 268): $\phi_0(t) = 1$, $\phi_1(t) = t - \frac{1}{2}$, $\phi_2(t) = t^2/2 - t/2 + 1/12$, and so on. Equation (7) then
yields the basic Euler–Maclaurin expansion with remainder:

$$\int_0^1 f(t)\,dt = \frac{f(0) + f(1)}{2} - \sum_{j=1}^{M} \frac{B_{2j}}{(2j)!}\,[f^{(2j-1)}]_0^1 + \int_0^1 f^{(2M)}(t)\,\phi_{2M}(t)\,dt.$$

From here, a formula results by summation (with $\{x\} := x - \lfloor x \rfloor$), which serves to compare
sums and integrals:

$$\int_0^n f(t)\,dt = \frac{f(0)}{2} + \sum_{j=1}^{n-1} f(j) + \frac{f(n)}{2} - \sum_{j=1}^{M} \frac{B_{2j}}{(2j)!}\,[f^{(2j-1)}]_0^n + \int_0^n f^{(2M)}(t)\,\phi_{2M}(\{t\})\,dt.$$

The asymptotic expansions of (4), p. 724, can finally be developed: use $f(t) = \log(t + 1)$
and $f(t) = 1/(t + 1)$. (Hint: see [142, §3.6], [465, pp. 281-289], or [538, §4.5].) The
fine characterisation of the “Euler–Maclaurin constants” (Euler’s constant $\gamma$ for $H_n$, Stirling’s
constant $\sqrt{2\pi}$ for Stirling’s approximation) is in general non-obvious: see pp. 238, 410, and
766 for complex-analytic alternatives. <


A.3. Combinatorial probability 


This entry gathers elementary concepts from probability theory specialized to the 
discrete case and used in Chapter III. A more elaborate discussion of probability 
theory forms the subject of Appendix C. 

Given a finite set $S$, the uniform probability measure assigns to any $\sigma \in S$ the
probability mass

$$\mathbb{P}(\sigma) = \frac{1}{\operatorname{card}(S)}.$$

The probability of any set, also known as event, $E \subseteq S$, is then measured by

$$\mathbb{P}(E) := \frac{\operatorname{card}(E)}{\operatorname{card}(S)} = \sum_{\sigma \in E} \mathbb{P}(\sigma)$$

(“the number of favorable cases over the total number of cases”).

Given a combinatorial class $\mathcal{A}$, we make extensive use of this notion with the
choice of $S = \mathcal{A}_n$. This defines a probability model (indexed by $n$), in which elements
of size $n$ in $\mathcal{A}$ are taken with equal likelihood. For this uniform probabilistic model,
we write

$$\mathbb{P}_n \quad \text{and} \quad \mathbb{P}_{\mathcal{A}_n},$$

whenever the size and the type of combinatorial structure considered need to be emphasized.

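Next consider a parameter $\chi$, which is a function from $S$ to $\mathbb{Z}_{\ge 0}$. We regard such
a parameter as a random variable, determined by its probability distribution,

$$\mathbb{P}(\chi = k) = \frac{\operatorname{card}(\{\sigma \mid \chi(\sigma) = k\})}{\operatorname{card}(S)}.$$

The notions above extend gracefully to non-uniform probability models that are determined
by a family of non-negative numbers $(p_\sigma)_{\sigma \in S}$ which add up to 1:

$$\mathbb{P}(\sigma) = p_\sigma, \qquad \mathbb{P}(E) := \sum_{\sigma \in E} p_\sigma, \qquad \mathbb{P}(\chi = k) = \sum_{\chi(\sigma) = k} p_\sigma.$$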


Moments. Important information on a distribution is provided by its moments.
We state here the definitions for an arbitrary discrete random variable supported by $\mathbb{Z}$
and determined by its probability distribution, $\mathbb{P}(X = k) = p_k$ where the $(p_k)_{k \in \mathbb{Z}}$
are non-negative numbers that add up to 1. The expectation of $f(X)$ is defined as the
linear functional

$$\mathbb{E}(f(X)) = \sum_k \mathbb{P}\{X = k\} \cdot f(k).$$

In particular, the (power) moment of order $r$ is defined as the expectation:

$$\mathbb{E}(X^r) = \sum_k \mathbb{P}\{X = k\} \cdot k^r.$$

Of special importance are the first two moments of the random variable $X$. The
expectation (also mean or average) $\mathbb{E}(X)$ is

$$\mathbb{E}(X) = \sum_k \mathbb{P}\{X = k\} \cdot k.$$

The second moment $\mathbb{E}(X^2)$ gives rise to the variance,

$$\mathbb{V}(X) = \mathbb{E}\bigl((X - \mathbb{E}(X))^2\bigr) = \mathbb{E}(X^2) - \mathbb{E}(X)^2,$$

and, in turn, to the standard deviation

$$\sigma(X) = \sqrt{\mathbb{V}(X)}.$$


The mean deserves its name as first observed by Galileo Galilei (1564-1642): if a 
large number of draws are effected and values of X are observed, then the arithmetical 
mean of the observed values will normally be close to the expectation E(X). The 
standard deviation measures in a mean quadratic sense the dispersion of values around 
the expectation E(X). 


> A.8. The weak law of large numbers. Let $(X_k)$ be a sequence of mutually independent
random variables with a common distribution. If the expectation $\mu = \mathbb{E}(X_k)$ exists, then for
every $\epsilon > 0$:

$$\lim_{n \to \infty} \mathbb{P}\left(\left|\frac{1}{n}(X_1 + \cdots + X_n) - \mu\right| > \epsilon\right) = 0.$$

(See [205, Ch. X] for a proof.) Note that the property does not require finite variance. <


Probability generating function. The probability generating function (PGF) of
a discrete random variable $X$, with values in $\mathbb{Z}_{\ge 0}$, is by definition:

$$p(u) := \sum_k \mathbb{P}(X = k)\, u^k,$$

and an alternative expression is $p(u) = \mathbb{E}(u^X)$. Moments can be recovered from the
PGF by differentiation at 1, for instance:

$$\mathbb{E}(X) = \left.\frac{d}{du} p(u)\right|_{u=1}, \qquad \mathbb{E}(X(X - 1)) = \left.\frac{d^2}{du^2} p(u)\right|_{u=1}.$$

More generally, the quantity,

$$\mathbb{E}(X(X - 1) \cdots (X - k + 1)) = \left.\frac{d^k}{du^k} p(u)\right|_{u=1},$$

is known as the kth factorial moment.


> A.9. Relations between factorial and power moments. Let $X$ be a discrete random variable
with PGF $p(u)$; denote by $\mu_r = \mathbb{E}(X^r)$ its rth moment and by $\phi_r$ its factorial moment. One
has

$$\mu_r = \left.\partial_t^r\, p(e^t)\right|_{t=0}, \qquad \phi_r = \left.\partial_u^r\, p(u)\right|_{u=1}.$$

Consequently, with ${n \brace k}$ and ${n \brack k}$ the Stirling numbers of both kinds (Appendix A.8: Stirling
numbers, p. 735),

$$\phi_r = \sum_j (-1)^{r-j} {r \brack j} \mu_j; \qquad \mu_r = \sum_j {r \brace j} \phi_j.$$

(Hint: for $\phi_r \to \mu_r$, expand the Stirling polynomial defined in (17), p. 736; in the converse
direction, write $p(e^t) = p(1 + (e^t - 1))$.) <
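
The two conversion formulae between power and factorial moments can be checked exactly on a small example. The sketch below is an illustration under assumptions made here: the distribution is that of a fair six-sided die, Stirling numbers are computed by their standard recurrences, and the function names are not taken from the text.

```python
from fractions import Fraction

def stirling2(r, j):
    """Stirling partition number via S(r, j) = j S(r-1, j) + S(r-1, j-1)."""
    if r == j:
        return 1
    if j == 0 or j > r:
        return 0
    return j * stirling2(r - 1, j) + stirling2(r - 1, j - 1)

def stirling1_unsigned(r, j):
    """Stirling cycle number via c(r, j) = (r-1) c(r-1, j) + c(r-1, j-1)."""
    if r == j:
        return 1
    if j == 0 or j > r:
        return 0
    return (r - 1) * stirling1_unsigned(r - 1, j) + stirling1_unsigned(r - 1, j - 1)

def power_moment(pmf, r):
    """E(X^r) for a finite distribution given as {value: probability}."""
    return sum(p * k ** r for k, p in pmf.items())

def factorial_moment(pmf, r):
    """E(X (X-1) ... (X-r+1)), the r-th derivative of the PGF at u = 1."""
    total = Fraction(0)
    for k, p in pmf.items():
        falling = 1
        for i in range(r):
            falling *= k - i
        total += p * falling
    return total

die = {k: Fraction(1, 6) for k in range(1, 7)}  # assumption: a fair die
for r in range(5):
    mu_r = power_moment(die, r)
    phi_r = factorial_moment(die, r)
    # mu_r from factorial moments, phi_r from power moments
    mu_from_phi = sum(stirling2(r, j) * factorial_moment(die, j) for j in range(r + 1))
    phi_from_mu = sum((-1) ** (r - j) * stirling1_unsigned(r, j) * power_moment(die, j)
                      for j in range(r + 1))
    print(r, mu_r, mu_from_phi, phi_r, phi_from_mu)
```

Exact rational arithmetic is used so that the identities hold as equalities rather than up to rounding.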


Markov—Chebyshev inequalities. These are fundamental inequalities that apply 
equally well to discrete and to continuous random variables (see Appendix C for the 
latter). 


Theorem A.1 (Markov–Chebyshev inequalities). Let $X$ be a non-negative random
variable and $Y$ an arbitrary real random variable. One has for an arbitrary $t > 0$:

$$\begin{aligned} \mathbb{P}\{X \ge t\,\mathbb{E}(X)\} &\le \frac{1}{t} && \text{(Markov inequality)} \\ \mathbb{P}\{|Y - \mathbb{E}(Y)| \ge t\,\sigma(Y)\} &\le \frac{1}{t^2} && \text{(Chebyshev inequality).} \end{aligned}$$

Proof. Without loss of generality, one may assume that $X$ has been scaled in such
a way that $\mathbb{E}(X) = 1$. Define the function $f(x)$ whose value is 1 if $x \ge t$, and 0
otherwise. Then

$$\mathbb{P}\{X \ge t\} = \mathbb{E}(f(X)).$$

Since $f(x) \le x/t$, the expectation on the right is at most $1/t$. Markov’s inequality
follows. Chebyshev’s inequality then results from Markov’s inequality applied to
$X = |Y - \mathbb{E}(Y)|^2$, with $t^2$ in place of $t$. □


Theorem A.1 informs us that the probability of being much larger than the mean 
must decay (Markov) and that an upper bound on the decay is measured in units given 
by the standard deviation (Chebyshev). 
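
Both inequalities can be observed on a concrete distribution. The sketch below is only an illustration: the binomial law with parameters 20 and 0.3, the sample values of t, and the helper names are choices made here. It compares the exact tail probabilities with the bounds 1/t and 1/t².

```python
from math import comb, sqrt

def binomial_pmf(n, p):
    """Probability mass function of a Binomial(n, p) variable as {k: probability}."""
    return {k: comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}

def mean(pmf):
    return sum(k * q for k, q in pmf.items())

def std_dev(pmf):
    m = mean(pmf)
    return sqrt(sum((k - m) ** 2 * q for k, q in pmf.items()))

def markov_tail(pmf, t):
    """P{X >= t E(X)} for a non-negative variable X."""
    m = mean(pmf)
    return sum(q for k, q in pmf.items() if k >= t * m)

def chebyshev_tail(pmf, t):
    """P{|X - E(X)| >= t sigma(X)}."""
    m, s = mean(pmf), std_dev(pmf)
    return sum(q for k, q in pmf.items() if abs(k - m) >= t * s)

pmf = binomial_pmf(20, 0.3)  # assumption: an arbitrary test distribution
for t in (1.5, 2.0, 3.0):
    print(t, markov_tail(pmf, t), 1 / t, chebyshev_tail(pmf, t), 1 / t ** 2)
```

For such a concentrated distribution the actual tails are far smaller than the bounds, which hold for every distribution with the given mean and standard deviation.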

Moment inequalities are discussed for instance in Billingsley’s reference trea- 
tise [68, p. 74]. They are of great importance in discrete mathematics where they 
have been put to use in order to show the existence of surprising configurations. This 
field was pioneered by Erdős and is often known as the “probabilistic method” [in
combinatorics]; see the book by Alon and Spencer [13] for many examples. Moment 
inequalities can also be used to estimate the probabilities of complex events by reduc- 
ing the problems to moment estimates for occurrences of simpler configurations—this 
is one of the bases of the “first and second moment methods”, again pioneered by 
Erdős, which are central in the theory of random graphs [76, 355]. Finally, moment
inequalities serve to design, analyse, and optimize randomized algorithms, a theme 
excellently covered in the book by Motwani and Raghavan [451]. 


A.4. Cycle construction 


The unlabelled cycle construction is introduced in Chapter I and is classically
obtained within the framework of Pólya theory (Note I.58, p. 85 and [129, 488, 491]).
The derivation given here is based on an elementary use of symbolic methods that
follows [259]. It relies on bivariate GFs developed in Chapter III, with $z$ marking size
and $u$ marking the number of components. Consider a class $\mathcal{A}$ and the sequence class
$\mathcal{S} = \mathrm{SEQ}_{\ge 1}(\mathcal{A})$. A sequence $\sigma \in \mathcal{S}$ is primitive (or aperiodic) if it is not the repetition
of another sequence (e.g., $\alpha\beta\beta\alpha\alpha$ is primitive, but $\alpha\beta\alpha\beta = (\alpha\beta)^2$ is not). The class
$\mathcal{PS}$ of primitive sequences is determined implicitly,

$$S(z, u) \equiv \frac{uA(z)}{1 - uA(z)} = \sum_{k \ge 1} PS(z^k, u^k),$$

which expresses that every sequence possesses a “root” that is primitive. Möbius
inversion (Equation (3), p. 722) then gives

$$PS(z, u) = \sum_{k \ge 1} \mu(k)\, S(z^k, u^k) = \sum_{k \ge 1} \mu(k)\, \frac{u^k A(z^k)}{1 - u^k A(z^k)}.$$

A cycle is primitive if all of its linear representations are primitive. There is an
exact one-to-$\ell$ correspondence between primitive $\ell$-cycles and primitive $\ell$-sequences.
Thus, the BGF $PC(z, u)$ of primitive cycles is obtained by effecting the transformation
$u^\ell \mapsto \frac{1}{\ell} u^\ell$ on $PS(z, u)$, which means

$$PC(z, u) = \int_0^u PS(z, v)\, \frac{dv}{v},$$

giving after term-wise integration,

$$PC(z, u) = \sum_{k \ge 1} \frac{\mu(k)}{k} \log \frac{1}{1 - u^k A(z^k)}.$$

Finally, cycles can be composed from arbitrary repetitions of primitive cycles
(each cycle has a primitive “root”), which yields for $\mathcal{C} = \mathrm{CYC}(\mathcal{A})$:

$$C(z, u) = \sum_{k \ge 1} PC(z^k, u^k).$$

The arithmetical identity $\sum_{d \mid k} \mu(d)/d = \varphi(k)/k$ gives eventually

(10)    $$C(z, u) = \sum_{k \ge 1} \frac{\varphi(k)}{k} \log \frac{1}{1 - u^k A(z^k)}.$$

Formula (10) is reduced to the formula that appears in the translation of the cy- 
cle construction in the unlabelled case (Theorem I.1, p. 27), upon setting u = 1; this 
formula also coincides with the statement of Proposition III.5, p. 171, regarding the 
number of components in cycles, and it yields the general multivariate version (Theo- 
rem III.1, p. 165) by a simple adaptation of the argument. 
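
Formula (10) can be tested against a direct enumeration. The following sketch is an illustration under assumptions made here: the class of atoms has m elements, all of size 1, so that its generating function is m z; the variable u is set to 1; and the function names are not from the text. The coefficients obtained from the formula are compared with a brute-force count of words taken up to cyclic rotation.

```python
from fractions import Fraction
from itertools import product
from math import gcd

def totient(k):
    """Euler totient by direct count (adequate for small k)."""
    return sum(1 for i in range(1, k + 1) if gcd(i, k) == 1)

def cycle_counts_from_formula(m, N):
    """Coefficients [z^n], n = 0..N, of sum_k totient(k)/k * log(1/(1 - m z^k))."""
    coeffs = [Fraction(0)] * (N + 1)
    for k in range(1, N + 1):
        # log(1/(1 - m z^k)) = sum_{j >= 1} m^j z^{k j} / j
        for j in range(1, N // k + 1):
            coeffs[k * j] += Fraction(totient(k), k) * Fraction(m ** j, j)
    return coeffs

def cycle_counts_brute_force(m, n):
    """Number of words of length n over m letters, up to cyclic rotation."""
    seen = set()
    for w in product(range(m), repeat=n):
        seen.add(min(w[i:] + w[:i] for i in range(n)))
    return len(seen)

m, N = 2, 8  # assumption: binary alphabet, sizes up to 8
from_formula = cycle_counts_from_formula(m, N)
for n in range(1, N + 1):
    print(n, from_formula[n], cycle_counts_brute_force(m, n))
```

Although the individual terms of the sum have rational coefficients, the totals are integers, as they must be for a counting sequence.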


A.5. Formal power series 


Formal power series [330, Ch. 1] extend the usual algebraic operations on polynomials
to infinite series of the form

(11)    $$f = \sum_{n \ge 0} f_n z^n,$$

where $z$ is a formal indeterminate. The notation $f(z)$ is also employed. Let $K$ be a
ring of coefficients (usually we shall take one of the fields $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$); the ring of formal
power series is denoted by $K[[z]]$ and it is the set $K^{\mathbb{N}}$ of infinite sequences of elements
of $K$, written as infinite sums (11), endowed with the operations of sum and product:

$$\begin{aligned} \Bigl(\sum_n f_n z^n\Bigr) + \Bigl(\sum_n g_n z^n\Bigr) &:= \sum_n (f_n + g_n)\, z^n \\ \Bigl(\sum_n f_n z^n\Bigr) \times \Bigl(\sum_n g_n z^n\Bigr) &:= \sum_n \Bigl(\sum_{k=0}^{n} f_k\, g_{n-k}\Bigr) z^n. \end{aligned}$$

A topology, known as the formal topology, is put on $K[[z]]$ by which two series
$f$, $g$ are “close” if they coincide to a large number of terms. First, the valuation of a
formal power series $f = \sum_n f_n z^n$ is the smallest $r$ such that $f_r \ne 0$ and is denoted
by $\operatorname{val}(f)$. (One sets $\operatorname{val}(0) = +\infty$.) Given two power series $f$ and $g$, their distance
$d(f, g)$ is then defined as $2^{-\operatorname{val}(f - g)}$. With this distance (in fact an ultrametric distance),
the space of all formal power series becomes a complete metric space. The
limit of a sequence of series $\{f^{(j)}\}$ exists if, for each $n$, the coefficient of order $n$ in
$f^{(j)}$ eventually stabilizes to a fixed value as $j \to \infty$. In this way formal convergence
can be defined for infinite sums: it suffices that the general term of the sum should
tend to 0 in the formal topology, i.e., the valuation of the general term should tend
to $\infty$. Similarly for infinite products, where $\prod (1 + u^{(j)})$ converges as soon as $u^{(j)}$
tends to 0 in the topology of formal power series.

It is then a simple exercise to prove that the sum $Q(f) := \sum_{k \ge 0} f^k$ exists (the
sum converges in the formal topology) whenever $f_0 = 0$; the quantity then defines the
quasi-inverse written $(1 - f)^{-1}$, with the implied properties with respect to multiplication
(namely, $Q(f)(1 - f) = 1$). In the same way one defines formally logarithms
and exponentials, primitives and derivatives, etc. Also, the composition $f \circ g$ is defined
whenever $g_0 = 0$ by substitution of formal power series. More generally, any process
on series that involves only finitely many operations at each coefficient is well-defined 
and is accordingly a continuous functional in the formal topology. 

It can then be verified that the usual functional properties of analysis extend to 
formal power series provided they make sense formally; for instance, the logarithm 
and the exponential of formal power series, as defined by their usual expansions, are 
inverses of one another (e.g., $\log(\exp(zf)) = zf$; $\exp(\log(1 + zf)) = 1 + zf$). The
extension to multivariate formal power series follows along entirely similar lines. 


▷ A.10. The OGF of permutations. The ordinary generating function of permutations,

$$P(z) = \sum_{n=0}^{\infty} n!\, z^n = 1 + z + 2z^2 + 6z^3 + 24z^4 + 120z^5 + 720z^6 + 5040z^7 + \cdots$$

exists as an element of $\mathbb{C}[[z]]$, although the series has radius of convergence 0. The quantity
$1/P(z)$ is well-defined (via the quasi-inverse) and one can effectively compute $1 - 1/P(z)$,
whose coefficients enumerate indecomposable permutations (p. 90). The formal series $P(z)$
can even be made sense of, analytically, but as an asymptotic series (Euler [198]), since

$$\int_0^{\infty} \frac{e^{-t}}{1 + tz}\, dt \sim 1 - z + 2!\, z^2 - 3!\, z^3 + 4!\, z^4 - \cdots \qquad (z \to 0^+).$$

Thus, the OGF of permutations is also representable as the (formal, divergent) asymptotic series
associated to an integral. ◁
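The quasi-inverse computation of Note A.10 can be tried out on truncated series. The following is a minimal sketch, assuming series are represented as Python lists of integer coefficients truncated at order N; the function names `product`, `quasi_inverse` and `indecomposable_counts` are illustrative choices, not notation from the text.

```python
from math import factorial

def product(f, g, N):
    """Cauchy product of two truncated series, keeping orders 0..N."""
    h = [0] * (N + 1)
    for i in range(N + 1):
        for j in range(N + 1 - i):
            h[i + j] += f[i] * g[j]
    return h

def quasi_inverse(f, N):
    """Coefficients of Q(f) = 1/(1 - f); requires f_0 = 0 (formal convergence)."""
    assert f[0] == 0, "the quasi-inverse needs valuation >= 1"
    q = [1] + [0] * N
    for n in range(1, N + 1):
        # Q = 1 + f*Q, read off at order n
        q[n] = sum(f[k] * q[n - k] for k in range(1, n + 1))
    return q

def indecomposable_counts(N):
    """Coefficients of 1 - 1/P(z), where P(z) = sum of n! z^n."""
    P = [factorial(n) for n in range(N + 1)]
    minus_f = [0] + [-c for c in P[1:]]      # 1/P = 1/(1 + f) = Q(-f), with f = P - 1
    inv_P = quasi_inverse(minus_f, N)
    return [0] + [-c for c in inv_P[1:]], inv_P, P

counts, inv_P, P = indecomposable_counts(6)
print(counts)   # [0, 1, 1, 3, 13, 71, 461]
```

Each coefficient is obtained by finitely many operations, which is exactly what makes the computation legitimate even though P(z) has radius of convergence 0.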
A.6. Lagrange inversion


Lagrange inversion (Lagrange, 1770) relates the coefficients of the compositional
inverse of a function to coefficients of the powers of the function itself (see [129, §3.8]
and [330, §1.9]). It thus establishes a fundamental correspondence between functional
composition and standard multiplication of series. Although the proof is technically
simple, the result is altogether non-elementary.

The inversion problem $z = h(y)$ consists in expressing $y$ as a function of $z$;
it is solved by the Lagrange series given below. It is assumed that $[y^0]h(y) = 0$,
so that inversion is formally well defined, and $[y^1]h(y) \neq 0$. The problem is then
conveniently standardized by defining $\phi(y) = y/h(y)$.


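Theorem A.2 (Lagrange Inversion Theorem). Let $\phi(u) = \sum_{k \ge 0} \phi_k u^k$ be a power
series of $\mathbb{C}[[u]]$ with $\phi_0 \neq 0$. Then, the equation $y = z\phi(y)$ admits a unique solution
in $\mathbb{C}[[z]]$ whose coefficients are given by (Lagrange form)

$$y(z) = \sum_{n=1}^{\infty} y_n z^n, \qquad \text{where } y_n = \frac{1}{n}\,[u^{n-1}]\,\phi(u)^n. \tag{12}$$

Furthermore, one has for $k > 0$ (Bürmann form)

$$y(z)^k = \sum_{n=1}^{\infty} y_n^{(k)} z^n, \qquad \text{where } y_n^{(k)} = \frac{k}{n}\,[u^{n-k}]\,\phi(u)^n. \tag{13}$$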


By linearity, a form equivalent to Bürmann’s (13), with $H$ an arbitrary function, is

$$[z^n]\, H(y(z)) = \frac{1}{n}\,[u^{n-1}]\,\bigl(H'(u)\,\phi(u)^n\bigr). \tag{14}$$


Proof. The method of indeterminate coefficients provides a system of polynomial
equations for $\{y_n\}$ that is seen to admit a unique solution:

$$y_1 = \phi_0, \qquad y_2 = \phi_0\phi_1, \qquad y_3 = \phi_0\phi_1^2 + \phi_0^2\phi_2, \quad \ldots$$


Since $y_n$ only depends polynomially on the coefficients of $\phi(u)$ up to order $n$, one may
assume without loss of generality, in order to establish (12) and (13), that $\phi$ is a
polynomial. Then, by general properties of analytic functions, $y(z)$ is analytic at 0 (see
Chapter IV and Appendix B.2: Equivalent definitions of analyticity, p. 741 for
definitions) and it maps conformally a neighbourhood of 0 into another neighbourhood of 0.
Accordingly, the quantity $n y_n = [z^{n-1}]\, y'(z)$ can be estimated by Cauchy’s coefficient
formula:

$$\begin{aligned}
n\, y_n &= \frac{1}{2i\pi} \int_{0^+} y'(z)\, \frac{dz}{z^n} && \text{(Direct coefficient formula for } y'(z)\text{)} \\
&= \frac{1}{2i\pi} \int_{0^+} \frac{dy}{(y/\phi(y))^n} && \text{(Change of variable } z \mapsto y\text{)} \\
&= [y^{n-1}]\, \phi(y)^n && \text{(Reverse coefficient formula for } \phi(y)^n\text{).}
\end{aligned} \tag{15}$$

In the context of complex analysis, this useful result appears as nothing but an avatar
of the change-of-variable formula. The proof of Bürmann’s form is similar. ∎
There exist instructive (but longer) combinatorial proofs based on what is known
as the “cyclic lemma” or “conjugacy principle” [503] for Łukasiewicz words. (See
Note I.47, p. 75 and the remarks surrounding Proposition II.7, p. 194.) Another
classical proof due to Henrici relies on properties of iteration matrices [330, §1.9]; see
also Comtet’s book for related formulations [129].

Lagrange inversion serves most notably to develop explicit formulae for simple 
varieties of trees (Chapters I, p. 66, and II, p. 128), mappings (Subsection II. 5.2,
p. 129), planar maps (Chapter VII, p. 516) and more generally for problems involving 
coefficients of powers of functions. 
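The Lagrange form (12) lends itself to a direct numerical check. The sketch below, which assumes truncated series stored as lists of `Fraction` coefficients, computes $y_n = \frac{1}{n}[u^{n-1}]\phi(u)^n$ and compares it with the solution of $y = z\phi(y)$ obtained by plain fixed-point iteration; the helper names (`mul`, `power`, `lagrange_coeffs`, `fixed_point`) are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from math import factorial

def mul(a, b, N):
    """Product of truncated series, keeping orders 0..N."""
    c = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a[:N + 1]):
        for j, bj in enumerate(b[:N + 1 - i]):
            c[i + j] += ai * bj
    return c

def power(a, n, N):
    """n-th power of a truncated series, keeping orders 0..N."""
    r = [Fraction(1)] + [Fraction(0)] * N
    for _ in range(n):
        r = mul(r, a, N)
    return r

def lagrange_coeffs(phi, N):
    """[y_0, ..., y_N] with y_n = (1/n) [u^(n-1)] phi(u)^n, as in (12)."""
    return [Fraction(0)] + [power(phi, n, n - 1)[n - 1] / n for n in range(1, N + 1)]

def fixed_point(phi, N):
    """Solve y = z*phi(y) by iteration; each pass fixes one more coefficient."""
    y = [Fraction(0)] * (N + 1)
    for _ in range(N):
        comp = [Fraction(0)] * (N + 1)                 # phi(y), truncated
        ypow = [Fraction(1)] + [Fraction(0)] * N       # y^k
        for phik in phi[:N + 1]:
            comp = [c + phik * p for c, p in zip(comp, ypow)]
            ypow = mul(ypow, y, N)
        y = [Fraction(0)] + comp[:N]                   # multiply by z
    return y

N = 6
exp_series = [Fraction(1, factorial(k)) for k in range(N + 1)]   # phi(u) = e^u
geo_series = [Fraction(1)] * (N + 1)                             # phi(u) = 1/(1-u)

cayley = lagrange_coeffs(exp_series, N)
print([int(factorial(n) * cayley[n]) for n in range(1, N + 1)])  # [1, 2, 9, 64, 625, 7776]
print([int(c) for c in lagrange_coeffs(geo_series, N)[1:]])      # [1, 1, 2, 5, 14, 42]
```

With $\phi(u) = e^u$ one recovers $n!\,y_n = n^{n-1}$ (Cayley trees), and with $\phi(u) = (1-u)^{-1}$ the Catalan numbers of general plane trees.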


▷ A.11. Lagrange–Bürmann inversion for fractional powers. The formula

$$[z^n] \left(\frac{y(z)}{z}\right)^{\alpha} = \frac{\alpha}{n + \alpha}\, [u^n]\, \phi(u)^{n + \alpha}$$

holds for any real or complex exponent $\alpha$, and hence generalizes Bürmann’s form. One can
similarly expand $\log(y(z)/z)$. ◁


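▷ A.12. Abel’s identity. By computing in two different ways the coefficient

$$[z^n]\, e^{(\alpha + \beta) y} = [z^n]\, e^{\alpha y} \cdot e^{\beta y},$$

where $y = z e^{y}$ is the Cayley tree function, one derives a collection of identities

$$(\alpha + \beta)(n + \alpha + \beta)^{n-1} = \alpha\beta \sum_{k=0}^{n} \binom{n}{k} (k + \alpha)^{k-1} (n - k + \beta)^{n-k-1},$$

known as Abel’s identities. ◁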


▷ A.13. A variant of Lagrange inversion. If $y(z)$ satisfies $y = z\phi(y)$, then one has
$z y' = y/(1 - z\phi'(y))$. Hence, for a function $a(y)$, the chain

$$[z^n]\, \frac{y\, a(y)}{1 - z\phi'(y)} = [z^n]\, z\, y'\, a(y) = [z^{n-1}]\, y'\, a(y) = n\, [z^n]\, A(y),$$

where $A$ is such that $A' = a$. This, by (14), yields the general evaluation:

$$[z^n]\, \frac{y\, a(y)}{1 - z\phi'(y)} = [u^{n-1}]\, a(u)\, \phi(u)^n.$$

In particular, for $\phi(u) = e^u$, we have $y = T$ (the Tree function), and $[z^n]\, T/(1 - T) = n^n/n!$,
that is, $n!\,[z^n]\, T/(1 - T) = n^n$, which gives back the number of mappings of size $n$. ◁


A.7. Regular languages 


A language is a set of words over some fixed alphabet $\mathcal{A}$. The structurally
simplest (yet non-trivial) languages are the regular languages that, as asserted on p. 57,
can be defined in several equivalent ways (see [6, Ch. 3] or [189]): by regular
expressions, either ambiguous or not, and by finite automata, either deterministic or
non-deterministic. Our definitions of S-regularity (S as in specification) and A-regularity
(A as in automaton) from Section I. 4, p. 49, correspond to definability by
unambiguous regular expression and deterministic automaton, respectively.
Regular expressions and ambiguity. Here is first the classical definition of a
regular expression in formal language theory.


Definition A.1. The category RegExp of regular expressions is defined inductively
by the property that it contains all the letters of the alphabet ($a \in \mathcal{A}$) as well as the
empty symbol $\epsilon$, and is such that, if $R_1, R_2 \in$ RegExp, then the formal expressions
$R_1 \cup R_2$, $R_1 \cdot R_2$ and $R_1^{\star}$ are regular expressions.


Regular expressions are meant to denote languages. The language $L(R)$
associated to $R$ is obtained by interpreting ‘$\cup$’ as set-theoretic union, ‘$\cdot$’ as catenation
product extended to sets and ‘$\star$’ as the star operation: $L(R^{\star}) := \{\epsilon\} \cup L(R) \cup
(L(R) \cdot L(R)) \cup \cdots$. These operations, since they rely on set-theoretic operations,
place no condition on multiplicities (a word may be obtained in several different
ways). Accordingly, the notions of a regular expression and a regular language are
useful when studying structural properties of languages, but they must be adapted for
enumeration purposes, where unambiguous specifications are needed.

A word $w \in L(R)$ may be parsable in several ways according to $R$: the ambiguity
coefficient (or multiplicity) of $w$ with respect to the regular expression $R$ is defined¹
as the number of parsings and written $\kappa(w) = \kappa_R(w)$.

A regular expression $R$ is said to be unambiguous if for all $w$, we have $\kappa_R(w) \in
\{0, 1\}$, ambiguous otherwise. In the unambiguous case, if $\mathcal{L} = L(R)$, then $\mathcal{L}$ is
S-regular in the sense of Chapter I, and a specification is obtained by the translation
rules


$$\cup \mapsto +, \qquad \cdot \mapsto \times, \qquad (\;)^{\star} \mapsto \mathrm{SEQ}, \tag{16}$$


so that the translation mechanism afforded by Proposition I.2 p. 52 applies. (Use of 
the general mechanism (16) in the ambiguous case would imply that we enumerate 
words with multiplicities [ambiguity coefficients] taken into account.) 


A-regularity implies S-regularity. This construction is due to Kleene [367] 
whose interest had its origin in the formal expressive power of nerve nets. Within 
the classical framework of the theory of regular languages, it produces from an au- 
tomaton (possibly non-deterministic) a regular expression (possibly ambiguous). 

For our purposes, let a deterministic automaton $\mathfrak{a}$ (as defined in Subsection I. 4.2,
p. 56) be given, with alphabet $\mathcal{A}$, set of states $Q$, with $q_0$ and $\overline{Q}$ the initial state
and the set of final states respectively (Definition I.11, p. 56). The idea consists in
constructing inductively the family of languages $\mathcal{L}^{(k)}_{i,j}$ of words that connect state $q_i$ to
state $q_j$ passing only through states $q_0, \ldots, q_k$ in between $q_i$ and $q_j$. We initialize the
data with $\mathcal{L}^{(-1)}_{i,j}$ to be the singleton set $\{a\}$ if the transition $(q_i \circ a) = q_j$ exists, and
the empty set ($\emptyset$) otherwise. The fundamental recursion

$$\mathcal{L}^{(k)}_{i,j} = \mathcal{L}^{(k-1)}_{i,j} + \mathcal{L}^{(k-1)}_{i,k}\; \mathrm{SEQ}\bigl\{\mathcal{L}^{(k-1)}_{k,k}\bigr\}\; \mathcal{L}^{(k-1)}_{k,j},$$

incrementally takes into account the possibility of traversing the “new” state $q_k$.
(The unions are clearly disjoint and the segmentation of words according to passages

¹ For instance, if $R = (a \cup aa)^{\star}$ and $w = aaaa$, then $\kappa(w) = 5$, corresponding to the five parsings:
$a{\cdot}a{\cdot}a{\cdot}a$, $a{\cdot}a{\cdot}aa$, $a{\cdot}aa{\cdot}a$, $aa{\cdot}a{\cdot}a$, $aa{\cdot}aa$.
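The ambiguity coefficient of the footnote example can be computed mechanically. Here is a minimal sketch, restricted (as an assumption of this illustration) to expressions of the special shape $(u_1 \cup \cdots \cup u_r)^{\star}$ with distinct non-empty words $u_i$; the function name `star_parsings` is hypothetical.

```python
def star_parsings(word, blocks):
    """Number of ways to cut `word` into consecutive factors taken from `blocks`.

    For R = (u_1 U ... U u_r)* with distinct non-empty u_i, this is the
    ambiguity coefficient kappa_R(word).
    """
    ways = [0] * (len(word) + 1)
    ways[0] = 1                                  # the empty prefix has one parsing
    for end in range(1, len(word) + 1):
        for b in blocks:
            start = end - len(b)
            if start >= 0 and word[start:end] == b:
                ways[end] += ways[start]         # last factor of the parsing is b
    return ways[-1]

print(star_parsings("aaaa", ["a", "aa"]))   # 5
```

For the blocks `a` and `aa` the counts follow the Fibonacci recurrence, so the expression $(a \cup aa)^{\star}$ is ambiguous on every word of length at least 2.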
    S-regularity  ≡  Unambiguous RegExp    ⟶    General RegExp
                            ↑ K                        ↓ I
    A-regularity  ≡  Deterministic FA    ⟵ RS ⟵  Non-deterministic FA


Figure A.1. Equivalence between various notions of regularity: K is Kleene’s
construction; RS is Rabin–Scott’s reduction; I is the inductive construction of the text.


through state $q_k$ is unambiguously defined, hence the validity of the sequence
construction.) The language $\mathcal{L}$ accepted by $\mathfrak{a}$ is then given by the regular specification

$$\mathcal{L} = \sum_{q_j \in \overline{Q}} \mathcal{L}^{(\|Q\|)}_{0,j},$$

that describes the set of all words leading from the initial state $q_0$ to any of the final
states while passing freely through any intermediate state of the automaton.
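At the level of generating functions, the recursion above becomes $L^{(k)}_{i,j} = L^{(k-1)}_{i,j} + L^{(k-1)}_{i,k}\,(1 - L^{(k-1)}_{k,k})^{-1}\,L^{(k-1)}_{k,j}$. The sketch below applies it to truncated OGFs and cross-checks against a direct count of accepted words. It rests on assumptions made only for this example: states are numbered 0..s with 0 initial, the automaton is a dictionary `delta`, and the empty word is added by hand when the initial state is final, since the recursion as initialized only produces non-empty words.

```python
def mul(a, b, N):
    c = [0] * (N + 1)
    for i in range(N + 1):
        for j in range(N + 1 - i):
            c[i + j] += a[i] * b[j]
    return c

def quasi_inverse(f, N):
    q = [1] + [0] * N                      # 1/(1 - f), assuming f[0] == 0
    for n in range(1, N + 1):
        q[n] = sum(f[k] * q[n - k] for k in range(1, n + 1))
    return q

def kleene_ogf(delta, n_states, finals, N):
    """Truncated OGF of the accepted language via the inductive construction (N >= 1)."""
    L = [[[0] * (N + 1) for _ in range(n_states)] for _ in range(n_states)]
    for (i, _letter), j in delta.items():
        L[i][j][1] += 1                    # one letter leading from q_i to q_j
    for k in range(n_states):              # allow passage through state q_k
        loop = quasi_inverse(L[k][k], N)
        L = [[[x + y for x, y in zip(L[i][j], mul(mul(L[i][k], loop, N), L[k][j], N))]
              for j in range(n_states)] for i in range(n_states)]
    total = [0] * (N + 1)
    for j in finals:
        total = [x + y for x, y in zip(total, L[0][j])]
    if 0 in finals:
        total[0] += 1                      # empty word, added by convention
    return total

def direct_counts(delta, n_states, finals, alphabet, N):
    """Number of accepted words of each length 0..N, by running the automaton."""
    vec, out = [1] + [0] * (n_states - 1), []
    for _ in range(N + 1):
        out.append(sum(vec[q] for q in finals))
        nxt = [0] * n_states
        for q in range(n_states):
            for a in alphabet:
                nxt[delta[(q, a)]] += vec[q]
        vec = nxt
    return out

# Hypothetical example: words over {a, b} with no two consecutive b's (state 2 is a sink).
delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 2, (2, "a"): 2, (2, "b"): 2}
print(kleene_ogf(delta, 3, {0, 1}, 6))     # [1, 2, 3, 5, 8, 13, 21]
```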


S-regularity implies A-regularity. An object described by a regular specification
t can be first encoded as a word, with separators indicating the way the word
should be parsed unambiguously. These encodings are then describable by a regular
expression using the correspondence of (16). Next any language described by a regular
expression is recognizable by an automaton (possibly non-deterministic) as shown by
an inductive construction. (We only state the principles informally here.) Let → [t] →
represent symbolically the automaton recognizing the regular expression t, with the
initial state represented by an incoming arrow on the left and the final state(s) by an
outgoing arrow on the right. Then, the rules are schematically

[Diagram: schematic automata for the union, the catenation product, and the star of
regular expressions; the drawings are not recoverable from the extraction.]

Finally, a standard result of the theory, the Rabin—Scott theorem, asserts that any 
non-deterministic finite automaton can be emulated by a deterministic one. (Note: 
this general reduction produces a deterministic automaton whose set of states is the 
powerset of the set of states of the original automaton; it may consequently involve an 
exponential blow-up in the size of descriptions.) 


A.8. Stirling numbers. 


These numbers count among the most famous ones of combinatorial analysis. 
They appear in two kinds: 
• the Stirling cycle number (also called ‘of the first kind’) $\begin{bmatrix} n \\ k \end{bmatrix}$ enumerates
permutations of size $n$ having $k$ cycles;

• the Stirling partition number (also called ‘of the second kind’) $\begin{Bmatrix} n \\ k \end{Bmatrix}$
enumerates partitions of an $n$-set into $k$ non-empty equivalence classes.

The notations $\begin{bmatrix} n \\ k \end{bmatrix}$ and $\begin{Bmatrix} n \\ k \end{Bmatrix}$ proposed by Knuth (himself anticipated by Karamata) are
nowadays most widespread; see [307].

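The most natural way to define Stirling numbers is in terms of the “vertical” EGFs
when the value of $k$ is kept fixed:

$$\sum_{n \ge 0} \begin{bmatrix} n \\ k \end{bmatrix} \frac{z^n}{n!} = \frac{1}{k!} \left(\log \frac{1}{1 - z}\right)^{k}, \qquad
\sum_{n \ge 0} \begin{Bmatrix} n \\ k \end{Bmatrix} \frac{z^n}{n!} = \frac{1}{k!} \left(e^z - 1\right)^{k}.$$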


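From here, the bivariate EGFs follow straightforwardly:

$$\begin{aligned}
\sum_{n, k \ge 0} \begin{bmatrix} n \\ k \end{bmatrix} u^k \frac{z^n}{n!} &= \exp\left(u \log \frac{1}{1 - z}\right) = (1 - z)^{-u}, \\
\sum_{n, k \ge 0} \begin{Bmatrix} n \\ k \end{Bmatrix} u^k \frac{z^n}{n!} &= \exp\left(u (e^z - 1)\right).
\end{aligned}$$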


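Stirling numbers and their cognates satisfy a host of algebraic relations. For
instance, the differential relations of the EGFs imply recurrences reminiscent of the
binomial recurrence

$$\begin{bmatrix} n \\ k \end{bmatrix} = \begin{bmatrix} n - 1 \\ k - 1 \end{bmatrix} + (n - 1) \begin{bmatrix} n - 1 \\ k \end{bmatrix}, \qquad
\begin{Bmatrix} n \\ k \end{Bmatrix} = \begin{Bmatrix} n - 1 \\ k - 1 \end{Bmatrix} + k \begin{Bmatrix} n - 1 \\ k \end{Bmatrix}.$$
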
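By techniques akin to Lagrange inversion or by expanding the powers in the vertical
EGF of the Stirling partition numbers, one finds explicit forms

$$\begin{aligned}
\begin{bmatrix} n \\ k \end{bmatrix} &= \sum_{0 \le j \le h \le n - k} (-1)^{n - k + j + h} \binom{h}{j} \binom{n - 1 + h}{n - k + h} \binom{2n - k}{n - k - h} \frac{(h - j)^{n - k + h}}{h!}, \\
\begin{Bmatrix} n \\ k \end{Bmatrix} &= \frac{1}{k!} \sum_{j = 0}^{k} \binom{k}{j} (-1)^{j} (k - j)^{n}.
\end{aligned}$$

Although comforting, these forms are not too useful in general, due to their sign
alternation. (The one relative to Stirling cycle numbers was obtained by Schlömilch
in 1852; see [129, p. 216].)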
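The recurrences and the explicit forms can be confronted numerically. The following sketch builds both tables from the standard recurrences $\left[{n \atop k}\right] = \left[{n-1 \atop k-1}\right] + (n-1)\left[{n-1 \atop k}\right]$ and $\left\{{n \atop k}\right\} = \left\{{n-1 \atop k-1}\right\} + k\left\{{n-1 \atop k}\right\}$, then evaluates the alternating sums. The sign convention $(-1)^{n-k+j+h}$ used for the cycle numbers is an assumption of this sketch (chosen so that the result is the unsigned number), and all function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def stirling_tables(N):
    """Tables cyc[n][k] and par[n][k] for 0 <= k <= n <= N, from the recurrences."""
    cyc = [[0] * (N + 1) for _ in range(N + 1)]
    par = [[0] * (N + 1) for _ in range(N + 1)]
    cyc[0][0] = par[0][0] = 1
    for n in range(1, N + 1):
        for k in range(1, n + 1):
            cyc[n][k] = cyc[n - 1][k - 1] + (n - 1) * cyc[n - 1][k]
            par[n][k] = par[n - 1][k - 1] + k * par[n - 1][k]
    return cyc, par

def partition_explicit(n, k):
    """Stirling partition number from the alternating sum (exact division by k!)."""
    return sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1)) // factorial(k)

def cycle_schlomilch(n, k):
    """Stirling cycle number from a Schloemilch-type double sum (n >= 1, 1 <= k <= n)."""
    d = n - k
    total = Fraction(0)
    for h in range(d + 1):
        for j in range(h + 1):
            term = comb(h, j) * comb(n - 1 + h, d + h) * comb(2 * n - k, d - h) * (h - j) ** (d + h)
            total += Fraction((-1) ** (d + j + h) * term, factorial(h))
    return total

cyc, par = stirling_tables(7)
print(cyc[4])   # [0, 6, 11, 6, 1, 0, 0, 0]
print(par[4])   # [0, 1, 7, 6, 1, 0, 0, 0]
```

The large cancellations inside `cycle_schlomilch` illustrate why such forms, although exact, are of limited practical use.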


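An important relation is that of the generating polynomials of the $\begin{bmatrix} n \\ r \end{bmatrix}$ for fixed $n$,

$$P_n(u) \equiv \sum_{r = 0}^{n} \begin{bmatrix} n \\ r \end{bmatrix} u^r = u\,(u + 1)(u + 2) \cdots (u + n - 1), \tag{17}$$
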
which nicely parallels the OGF for the $\begin{Bmatrix} n \\ r \end{Bmatrix}$ for fixed $r$:

$$\sum_{n = 0}^{\infty} \begin{Bmatrix} n \\ r \end{Bmatrix} z^n = \frac{z^r}{(1 - z)(1 - 2z) \cdots (1 - rz)}.$$


▷ A.14. Schlömilch’s formula. It is established starting from

$$\frac{k!}{n!} \begin{bmatrix} n \\ k \end{bmatrix} = \frac{1}{2i\pi} \oint \log^{k} \frac{1}{1 - z}\, \frac{dz}{z^{n+1}}$$

via the change of variable à la Lagrange: $z = 1 - e^{-t}$. See [129, p. 216] and [251]. ◁


A.9. Tree concepts 


In the abstract graph-theoretic sense, a forest is an acyclic (undirected) graph and
a tree is a forest that consists of just one connected component. A rooted tree is a
tree in which a specific node is distinguished, the root. Rooted trees are drawn with the
root either below (the mathematician’s and botanist’s convention) or on top (the genealogist’s
and computer scientist’s convention), and in this book, we employ both conventions
interchangeably. Here then are two planar representations of the same rooted tree


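(18) [Diagram: two planar drawings of the same rooted tree, each with starred root a
and child b; the lower levels of the drawings are not recoverable from the extraction.]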

where the star distinguishes the root. (Tags on nodes, a, b, c, etc., are not part of the
tree structure but are only meant to discriminate nodes here.) A tree whose nodes are
labelled by distinct integers then becomes a labelled tree, this in the precise technical
sense of Chapter II. Size is defined by the number of nodes (vertices). Here is for
instance a labelled tree of size 9:


(19) [Diagram: a labelled rooted tree of size 9; the drawing is not recoverable from
the extraction.]


In a rooted tree, the outdegree of a node is the number of its children (immediate
descendants); with the sole exception of the root, outdegree is thus equal to degree (in the
graph-theoretic sense, i.e., the number of neighbours) minus 1. Once this convention is clear, one
usually abbreviates “outdegree” by “degree” when speaking of rooted trees. A leaf is
a node without descendant, that is, a node of (out)degree equal to 0. For instance the
tree in (19) has five leaves. Non-leaf nodes are also called internal nodes.

Many applications from genealogy to computer science require superimposing
an additional structure on a graph-theoretic tree. A plane tree (sometimes also called
a planar tree) is defined as a tree in which subtrees dangling from a common node
are ordered between themselves and represented from left to right in order. Thus, the
two representations in (18) are equivalent as graph-theoretic trees, but they become
distinct objects when regarded as plane trees.

Figure A.2. Three representations of a binary tree.

Binary trees play a very special role in combinatorics. These are rooted trees
in which every non-leaf node has degree 2 exactly as, for instance, in the first two
drawings of Figure A.2. In the second case, the leaves have been distinguished by ‘□’.
The pruned binary tree (third representation) is obtained from a regular binary tree by
removing the leaves; such a tree then has unary branching nodes of either one of two
possible types (left- or right-branching). A binary tree can be fully reconstructed from
its pruned version, and a tree of size 2n + 1 always expands a pruned tree of size n.

A few major classes are encountered throughout this book. Here is a summary².

    general plane trees (Catalan trees)      G = Z × SEQ(G)                      (unlabelled)
    binary trees                             A = Z + (Z × A × A)                 (unlabelled)
    non-empty pruned binary trees            B = Z + 2(Z × B) + (Z × B × B)      (unlabelled)
    pruned binary trees                      C = 1 + (Z × C × C)                 (unlabelled)
    general non-plane trees (Cayley trees)   T = Z ⋆ SET(T)                      (labelled)


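The corresponding GFs are, respectively,

$$G(z) = \frac{1 - \sqrt{1 - 4z}}{2}, \quad A(z) = \frac{1 - \sqrt{1 - 4z^2}}{2z}, \quad B(z) = \frac{1 - 2z - \sqrt{1 - 4z}}{2z}, \quad
C(z) = \frac{1 - \sqrt{1 - 4z}}{2z}, \quad T(z) = z\, e^{T(z)},$$

being of type OGF for the first four and EGF for the last one. The corresponding
counts are

$$G_n = \frac{1}{n} \binom{2n - 2}{n - 1}, \quad A_{2n+1} = \frac{1}{n + 1} \binom{2n}{n}, \quad B_n = \frac{1}{n + 1} \binom{2n}{n} \ (n \ge 1), \quad
C_n = \frac{1}{n + 1} \binom{2n}{n}, \quad T_n = n^{n - 1}.$$

The common occurrence of the Catalan numbers ($C_n = B_n = A_{2n+1} = G_{n+1}$) is
explained by pruning and by the rotation correspondence described on p. 73.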
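The counts for the four unlabelled classes can be recovered directly from the specifications, without solving any quadratic equation. The sketch below iterates the translated equations $G = z/(1 - G)$, $A = z + zA^2$, $B = z + 2zB + zB^2$ and $C = 1 + zC^2$ on truncated integer series; these equations are the usual OGF translations assumed for this illustration, and the helper names are hypothetical.

```python
from math import comb

def mul(a, b, N):
    c = [0] * (N + 1)
    for i in range(N + 1):
        for j in range(N + 1 - i):
            c[i + j] += a[i] * b[j]
    return c

def quasi_inverse(f, N):
    q = [1] + [0] * N                     # 1/(1 - f), assuming f[0] == 0
    for n in range(1, N + 1):
        q[n] = sum(f[k] * q[n - k] for k in range(1, n + 1))
    return q

def iterate(step, N):
    """Fixed-point iteration; every pass determines at least one more coefficient."""
    y = [0] * (N + 1)
    for _ in range(N + 1):
        y = step(y)
    return y

def tree_ogfs(N):
    """Truncated OGFs (orders 0..N, N >= 1) of the classes G, A, B, C."""
    z = [0, 1] + [0] * (N - 1)
    one = [1] + [0] * N
    shift = lambda s: [0] + s[:N]                          # multiplication by z
    add = lambda *ss: [sum(cs) for cs in zip(*ss)]
    G = iterate(lambda g: shift(quasi_inverse(g, N)), N)
    A = iterate(lambda a: add(z, shift(mul(a, a, N))), N)
    B = iterate(lambda b: add(z, shift([2 * c for c in b]), shift(mul(b, b, N))), N)
    C = iterate(lambda c: add(one, shift(mul(c, c, N))), N)
    return G, A, B, C

G, A, B, C = tree_ogfs(8)
print(G)   # [0, 1, 1, 2, 5, 14, 42, 132, 429]
print(A)   # [0, 1, 0, 1, 0, 2, 0, 5, 0]
```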


² The term “general” refers to the fact that no degree constraint is imposed.


APPENDIX B 


Basic Complex Analysis 


This appendix contains entries arranged in alphabetical order regarding the following topics: 
Algebraic elimination; Equivalent definitions of analyticity; Gamma function; Holo- 
nomic functions; Implicit Function Theorem; Laplace’s method; Mellin transform; 
Several complex variables. 

The corresponding notions and results are used starting with Part B, which is relative to Complex 
Asymptotics. The present entries, together with the first sections of Chapter IV, should enable 
a reader, previously unacquainted with complex analysis but with a fair background in basic 
calculus, to follow the main developments of analytic combinatorics. There are a number of ex- 
cellent classic presentations of complex analysis: the books by Dieudonné [165], Henrici [329], 
Hille [334], Knopp [373], Titchmarsh [577], and Whittaker-Watson [604] are of special inter- 
est, given their concrete approach to the subject (see also our comments on p. 286). 


B.1. Algebraic elimination 


Auxiliary quantities can be eliminated from systems of polynomial equations. In
essence, elimination is achieved by suitable combinations of the equations themselves.
One of the best strategies is based on Gröbner bases and is presented in the excellent
book of Cox, Little, and O’Shea [135]. This entry develops a more elementary
approach based on resultants. It is necessitated by the analysis of algebraic curves,
functions, and systems (Sections VII.6, p. 482, and VII. 7, p. 493), with a general
applicability to context-free structures introduced on p. 79.


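Resultants. Consider a field of coefficients $\mathbb{K}$, which may be specialized as
$\mathbb{Q}$, $\mathbb{C}$, $\mathbb{C}(z)$, …, as the need arises. A polynomial of degree $d$ in $\mathbb{K}[x]$ has at most
$d$ roots in $\mathbb{K}$ and exactly $d$ in the algebraic closure $\overline{\mathbb{K}}$ of $\mathbb{K}$. Given two polynomials,

$$P(x) = \sum_{j = 0}^{\ell} a_j x^{\ell - j}, \qquad Q(x) = \sum_{k = 0}^{m} b_k x^{m - k},$$

their resultant (with respect to the variable $x$) is the determinant of order $(\ell + m)$,

$$R(P, Q, x) = \det \begin{pmatrix}
a_0 & a_1 & a_2 & \cdots & 0 & 0 \\
0 & a_0 & a_1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{\ell - 1} & a_{\ell} \\
b_0 & b_1 & b_2 & \cdots & 0 & 0 \\
0 & b_0 & b_1 & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & \cdots & b_{m - 1} & b_m
\end{pmatrix}, \tag{1}$$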

also called the Sylvester determinant. By its definition, the resultant is a polynomial
form in the coefficients of $P$ and $Q$. The main properties of resultants are the
following: (i) if $P(x), Q(x) \in \mathbb{K}[x]$ have a common root in the algebraic closure $\overline{\mathbb{K}}$ of
$\mathbb{K}$, then $R(P(x), Q(x), x) = 0$; (ii) conversely, if $R(P(x), Q(x), x) = 0$ holds, then
either $a_0 = b_0 = 0$ or else $P(x), Q(x)$ have a common root in $\overline{\mathbb{K}}$. (The idea of the
proof of (i) is as follows. Let $S$ be the matrix in (1). Then the homogeneous linear
system $Sw = 0$ admits a solution $w = (\xi^{\ell + m - 1}, \ldots, \xi^2, \xi, 1)$ in which $\xi$ is a
common root of $P$ and $Q$; this is only possible if $\det(S) = R$ vanishes.) See especially van
der Waerden’s crisp treatment in [590] and Lang’s treatise [401, V.10] for a detailed
presentation of resultants.

Equating the resultant to 0 thus provides a necessary condition for the existence
of common roots, but not always a sufficient one. This has implications in situations
where the coefficients $a_j, b_k$ depend on one or several parameters. In that case, the
condition $R(P, Q, x) = 0$ will certainly capture all the situations in which $P$ and $Q$
have a common root, but it may also include some situations where there is a reduction
in degree, although the polynomials have no common root. For instance, take $P(x) =
tx - 2$ and $Q(x) = tx^2 - 4$ (with $t$ a parameter); the resultant with respect to $x$ is

$$R = 4t(1 - t).$$

Indeed, the condition $R = 0$ corresponds to either a common root ($t = 1$ for which
$P(2) = Q(2) = 0$) or to some degeneracy in degree ($t = 0$ for which $P(x) = -2$ and
$Q(x) = -4$ have no common zero).


Systems of equations. Given a system 


$$\{P_j(z, y_1, y_2, \ldots, y_m) = 0\}, \qquad j = 1 \,..\, m, \tag{2}$$


defining an algebraic curve, we can then proceed as follows in order to extract a single
equation satisfied by one of the indeterminates. By taking resultants with $P_m$,
eliminate all occurrences of the variable $y_m$ from the first $m - 1$ equations, thereby obtaining
a new system of $m - 1$ equations in $m - 1$ variables (with $z$ kept as a parameter, so that
the base field is $\mathbb{C}(z)$). Repeat the process and successively eliminate $y_{m-1}, \ldots, y_2$.
The strategy (in the simpler case where variables are eliminated in succession exactly
one at a time) is summarized in the skeleton procedure Eliminate:


procedure Eliminate (P_1, ..., P_m, y_1, y_2, ..., y_m);
    {Elimination of y_2, ..., y_m by resultants}
    (A_1, ..., A_m) := (P_1, ..., P_m);
    for j from m by −1 to 2 do
        for k from j − 1 by −1 to 1 do
            A_k := R(A_k, A_j, y_j);
    return (A_1).

The polynomials obtained need not be minimal, in which case, one should appeal
to multivariate polynomial factorization in order to select the relevant factors at each
stage. (Gröbner bases provide a neater alternative to these questions, see [135].)

Computer algebra systems usually provide implementations of both resultants and
Gröbner bases. The complexity of elimination is, however, exponential in the worst
case: degrees essentially multiply, which is somewhat intrinsic. For example, $y_0$ in
the quadratic system of $k$ equations

$$y_0 - z - y_{k-1}^2 = 0, \quad y_{k-1} - y_{k-2}^2 = 0, \quad \ldots, \quad y_1 - y_0^2 = 0$$

(determining the OGF of regular trees of degree $2^k$) represents an algebraic function
of degree $2^k$ and no less.
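As a small worked illustration of how degrees multiply, consider a two-equation chain of this quadratic type (the specific pair below is chosen here for illustration), and eliminate $y_1$ by a resultant taken with respect to $y_1$, using the Sylvester determinant of (1):

```latex
% Two-equation quadratic chain (illustrative instance):
%   A_1 = -y_1^2 + (y_0 - z),   A_2 = y_1 - y_0^2.
% Resultant with respect to y_1 (degrees 2 and 1, hence a 3 x 3 determinant):
\[
R(A_1, A_2, y_1) \;=\;
\det\begin{pmatrix}
-1 & 0 & y_0 - z \\
 1 & -y_0^2 & 0 \\
 0 & 1 & -y_0^2
\end{pmatrix}
\;=\; -\,y_0^4 + (y_0 - z)
\;=\; y_0 - z - y_0^4 .
\]
% Hence y_0 satisfies y_0 = z + y_0^4: two quadratic equations yield degree 2 * 2 = 4.
```

The same answer is of course obtained by direct substitution of $y_1 = y_0^2$; the determinant form is what generalizes when no equation is linear in the variable being eliminated.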
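▷ B.1. Resultant and roots. Let $P, Q \in \mathbb{C}[x]$ have roots $\{\alpha_j\}$ and $\{\beta_j\}$, respectively. Then

$$R(P, Q, x) = a_0^m\, b_0^{\ell} \prod_{i = 1}^{\ell} \prod_{j = 1}^{m} (\alpha_i - \beta_j) = a_0^m \prod_{i = 1}^{\ell} Q(\alpha_i).$$

The discriminant of $P$ classically defined by $D(P) := a_0^{-1} R(P(x), P'(x), x)$ satisfies

$$D(P) = a_0^{-1} R(P(x), P'(x), x) = a_0^{2\ell - 2} \prod_{i \neq j} (\alpha_i - \alpha_j).$$

Given the coefficients of $P$ and the value of $D(P)$, an effectively computable bound on
the minimal separation distance $\delta$ between any two roots of $P$ can be found. (Hint. Let
$A = 1 + \max_j(|a_j/a_0|)$. Then each $\alpha_j$ satisfies $|\alpha_j| < A$. Set $L = \binom{\ell}{2}$. Since each of the
$L$ factors $|\alpha_i - \alpha_j|^2$ ($i < j$) in $|D(P)|\,|a_0|^{2 - 2\ell}$ is at most $(2A)^2$, one finds
$\delta^2 \ge |a_0|^{2 - 2\ell}\, |D(P)|\, (2A)^{2 - 2L}$.) ◁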


B.2. Equivalent definitions of analyticity 


Two parallel notions are introduced at the beginning of Chapter IV: analyticity 
(defined by power series expansions) and holomorphy (defined as complex differen- 
tiability). As is known from any textbook on complex analysis, these notions are 
equivalent. Given their importance for analytic combinatorics, this appendix entry 
sketches a proof of the equivalence, which is summarized by the following diagram: 


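$$\text{Analyticity} \;\overset{[A]}{\Longrightarrow}\; \mathbb{C}\text{-differentiability} \;\overset{[B]}{\Longrightarrow}\; \text{Null Integral Property} \;\overset{[C]}{\Longrightarrow}\; \text{Analyticity}.$$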

A. Analyticity implies complex-differentiability. Let $f(z)$ be analytic in the disc
$D(z_0; R)$. We may assume without loss of generality that $z_0 = 0$ and $R = 1$ (else
effect a linear transformation on the argument $z$). According to the definition of
analyticity, the series representation

$$f(z) = \sum_{n = 0}^{\infty} f_n z^n, \tag{3}$$

converges for all $z$ with $|z| < 1$. Elementary series rearrangements first entail that $f(z)$
given by this representation is analytic at any $z_1$ interior to $D(0; 1)$; similar techniques
then show the existence of the derivative as well as the fact that the derivative can be
obtained by term-wise differentiation of (3). See Note B.2 for details.

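▷ B.2. Proof of [A]: Analyticity implies differentiability. Formally, the binomial theorem
provides

$$\begin{aligned}
f(z) = \sum_{n \ge 0} f_n z^n &= \sum_{n \ge 0} f_n (z_1 + z - z_1)^n \\
&= \sum_{n \ge 0} \sum_{k = 0}^{n} \binom{n}{k} f_n\, z_1^{k} (z - z_1)^{n - k} \\
&= \sum_{m \ge 0} c_m (z - z_1)^m, \qquad c_m := \sum_{k \ge 0} \binom{m + k}{k} f_{m + k}\, z_1^{k}.
\end{aligned} \tag{4}$$

Let $r_1$ be any number smaller than $1 - |z_1|$. We observe that (4) makes analytic sense
for $|z - z_1| \le r_1$. Indeed, one has the bound $|f_n| \le C A^n$, valid for any $A > 1$ and some
$C > 0$. Thus, the terms in (4) are dominated in absolute value by those of the double series

$$\sum_{n \ge 0} \sum_{k = 0}^{n} \binom{n}{k} C A^n |z_1|^{k} r_1^{n - k} = C \sum_{n \ge 0} A^n (|z_1| + r_1)^n = \frac{C}{1 - A(|z_1| + r_1)}, \tag{5}$$

which is absolutely convergent as soon as $A$ is chosen such that $A < (|z_1| + r_1)^{-1}$.
Complex differentiability of $f$ at any $z_1 \in D(0; 1)$ is derived from the analogous calculation,
valid for small enough $\delta$,

$$\begin{aligned}
\frac{1}{\delta} \bigl(f(z_1 + \delta) - f(z_1)\bigr) &= \sum_{n \ge 0} n f_n z_1^{n - 1} + \delta \sum_{n \ge 0} \sum_{k = 2}^{n} \binom{n}{k} f_n\, z_1^{n - k} \delta^{k - 2} \\
&= \sum_{n \ge 0} n f_n z_1^{n - 1} + O(\delta),
\end{aligned} \tag{6}$$

where boundedness of the coefficient of $\delta$ results from an argument analogous to (5). ◁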


The argument of Note B.2 has shown that the derivative of f at z; is obtained by 
differentiating termwise the series representing f. More generally derivatives of all 
orders exist and can be obtained in a similar fashion. In view of this fact, the equalities 
of (4) can also be interpreted as the Taylor expansion (by grouping terms according to 
values of k first) 


$$f(z_1 + \delta) = f(z_1) + \delta f'(z_1) + \frac{\delta^2}{2!} f''(z_1) + \cdots, \tag{7}$$


which is thus generally valid for analytic functions. 


B. Complex differentiability implies the “Null Integral” Property. The Null Integral
Property relative to a domain $\Omega$ is the property:

$$\int_{\lambda} f(z)\, dz = 0 \qquad \text{for any loop } \lambda \subset \Omega.$$

(A loop is a closed path that can be contracted to a single point in the domain $\Omega$.) Its
proof results from the Cauchy–Riemann equations and Green’s formula.

▷ B.3. Proof of [B]: the Null Integral Property. This starts from the Cauchy–Riemann
equations. Let $P(x, y) = \Re f(x + iy)$ and $Q(x, y) = \Im f(x + iy)$. By adopting successively in the
definition of complex differentiability $\delta = h$ and $\delta = ih$, one finds $P'_x + i Q'_x = Q'_y - i P'_y$,
implying the Cauchy–Riemann equations:

$$\frac{\partial P}{\partial x} = \frac{\partial Q}{\partial y} \qquad \text{and} \qquad \frac{\partial P}{\partial y} = -\frac{\partial Q}{\partial x}. \tag{8}$$

(The functions $P$ and $Q$ satisfy the partial differential equations $\Delta f = 0$, where $\Delta$ is the
two-dimensional Laplacian $\Delta := \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$; such functions are known as harmonic functions.)
The Null Integral Property, given differentiability, results from the Cauchy–Riemann equations,
upon taking into account Green’s theorem of multivariate calculus,

$$\int_{\partial K} A\, dx + B\, dy = \iint_{K} \left(\frac{\partial B}{\partial x} - \frac{\partial A}{\partial y}\right) dx\, dy,$$

which is valid for any (compact) domain $K$ enclosed by a simple curve $\partial K$. ◁




C. Complex differentiability implies analyticity. The starting point is the formula

$$f(a) = \frac{1}{2i\pi} \int_{\gamma} \frac{f(z)}{z - a}\, dz, \tag{9}$$

knowing only differentiability of $f$ and its consequence, the Null Integral Property,
but precisely not postulating the existence of an analytic expansion (here $\gamma$ is a simple
positive loop encircling $a$ inside a region in which $f$ is analytic).

▷ B.4. Proof of [C]: the integral representation. The proof of (9) is obtained by decomposing
$f(z)$ in the original integral as $f(z) = f(z) - f(a) + f(a)$. Define accordingly $g(z) =
(f(z) - f(a))/(z - a)$, for $z \neq a$, and $g(a) = f'(a)$. By the differentiability assumption, $g$
is continuous and holomorphic (differentiable) at any point other than $a$. Its integral is thus 0
along $\gamma$. On the other hand, we have

$$\int_{\gamma} \frac{dz}{z - a} = 2i\pi,$$

by a simple computation: deform $\gamma$ to a small circle around $a$ and evaluate the integral directly
by setting $z - a = r e^{i\theta}$. ◁


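Once (9) is granted, it suffices to write, e.g., for an expansion at 0,

$$\begin{aligned}
f(z) &= \frac{1}{2i\pi} \int_{\gamma} f(t)\, \frac{dt}{t - z} \\
&= \frac{1}{2i\pi} \int_{\gamma} f(t) \left(1 + \frac{z}{t} + \frac{z^2}{t^2} + \cdots\right) \frac{dt}{t} \\
&= \sum_{n \ge 0} f_n z^n, \qquad f_n := \frac{1}{2i\pi} \int_{\gamma} f(t)\, \frac{dt}{t^{n + 1}}.
\end{aligned}$$

(Exchanges of integration and summation are justified by normal convergence.)
Analyticity is thus proved from the Null Integral Property.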
▷ B.5. Cauchy’s formula for derivatives. One has

$$f^{(n)}(a) = \frac{n!}{2i\pi} \int_{\gamma} \frac{f(z)}{(z - a)^{n + 1}}\, dz.$$

This follows from (9) by differentiation under the integral sign. ◁
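Cauchy’s formula for derivatives is easy to test numerically. The sketch below assumes that $\gamma$ is a circle of radius $r$ centred at $a$, parametrized by $z = a + re^{i\theta}$, so that the formula becomes $\frac{n!}{2\pi}\int_0^{2\pi} f(a + re^{i\theta})\,(re^{i\theta})^{-n}\,d\theta$, and approximates the integral by an equally spaced Riemann sum; the function name `cauchy_derivative` and its default parameters are choices made only for this illustration.

```python
import cmath
import math

def cauchy_derivative(f, a, n, radius=1.0, samples=256):
    """Approximate f^(n)(a) from values of f on the circle |z - a| = radius."""
    total = 0j
    for k in range(samples):
        w = radius * cmath.exp(2j * math.pi * k / samples)   # w = z - a on the circle
        total += f(a + w) / w ** n                           # integrand after dz = i*w*dtheta
    return math.factorial(n) * total / samples

# Third derivative of exp at a = 0.3, to be compared with exp(0.3).
approx = cauchy_derivative(cmath.exp, 0.3, 3)
print(abs(approx - math.exp(0.3)) < 1e-9)   # True
```

For a function analytic in a disc larger than the circle, the equally spaced sum converges geometrically fast in the number of sample points, which is why a few hundred samples suffice here.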


▷ B.6. Morera’s Theorem. Suppose that $f$ is continuous [but not a priori known to be
differentiable] in an open set $\Omega$ and that its integral along any triangle in $\Omega$ is 0. Then, $f$ is analytic
(hence holomorphic) in $\Omega$. (For details, see, e.g., [497, p. 68].) This theorem is useful for
disposing of apparent (or “removable”) singularities, as in $(\cos(z) - 1)/\sin(z)$. ◁


B.3. Gamma function 


The formulae of singularity analysis in Chapter IV involve the Gamma function 
in an essential manner. The Gamma function extends to non-integral arguments the 
factorial function. We collect in this appendix a few classical facts regarding it. Proofs 
may be found in classic treatises like Henrici’s [329] or Whittaker and Watson’s [604]. 
Figure B.1. A plot of $\Gamma(s)$ for real $s$.


Basic properties. Euler introduced the Gamma function as

$$\Gamma(s) = \int_0^{\infty} e^{-t}\, t^{s - 1}\, dt, \tag{10}$$

where the integral converges provided $\Re(s) > 0$. Through integration by parts, one
immediately derives the basic functional equation of the Gamma function,

$$\Gamma(s + 1) = s\, \Gamma(s). \tag{11}$$

Since $\Gamma(1) = 1$, one has $\Gamma(n + 1) = n!$, so that the Gamma function serves to extend
the factorial function for non-integral arguments. The special value,

$$\Gamma\left(\tfrac{1}{2}\right) = \int_0^{\infty} e^{-t}\, \frac{dt}{\sqrt{t}} = 2 \int_0^{\infty} e^{-x^2}\, dx = \sqrt{\pi}, \tag{12}$$

proves to be quite important. It implies in turn $\Gamma(-\tfrac{1}{2}) = -2\sqrt{\pi}$.

From (11), the Gamma function can be analytically continued to the whole of $\mathbb{C}$
with the exception of poles at $0, -1, -2, \ldots$. Indeed, the functional equation used
backwards yields

$$\Gamma(s) \sim \frac{(-1)^m}{m!}\, \frac{1}{s + m} \qquad (s \to -m),$$

so that the residue of $\Gamma(s)$ at $s = -m$ is $(-1)^m/m!$. Figure B.1 depicts the graph of
$\Gamma(s)$ for real values of $s$.
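These basic properties can be observed numerically with the `math.gamma` function of the Python standard library. The sketch below checks the factorial values, the special values at $\pm\tfrac12$, and estimates the residue at $s = -m$ by evaluating $(s + m)\Gamma(s)$ at a nearby point; the helper name `residue_estimate` and the offset `eps` are choices made for this illustration only.

```python
import math

def residue_estimate(m, eps=1e-6):
    """Estimate the residue of Gamma at s = -m as (s + m) * Gamma(s), taken at s = -m + eps."""
    return eps * math.gamma(-m + eps)

# Gamma(n + 1) = n!
print(all(math.isclose(math.gamma(n + 1), math.factorial(n)) for n in range(10)))   # True

# Special values Gamma(1/2) = sqrt(pi) and Gamma(-1/2) = -2 sqrt(pi)
print(math.isclose(math.gamma(0.5), math.sqrt(math.pi)))          # True
print(math.isclose(math.gamma(-0.5), -2 * math.sqrt(math.pi)))    # True

# Residues at 0, -1, -2, -3 compared with (-1)^m / m!
for m in range(4):
    print(m, residue_estimate(m), (-1) ** m / math.factorial(m))
```

The estimate carries a relative error of order `eps`, so agreement to about five or six digits is what should be expected from the last loop.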


▷ B.7. Evaluation of the Gaussian integral. Define $J := \int_0^{\infty} e^{-x^2}\, dx$. The idea is to
evaluate $J^2$:

$$J^2 = \int_0^{\infty} \int_0^{\infty} e^{-(x^2 + y^2)}\, dx\, dy.$$

Going to polar coordinates, $(x^2 + y^2)^{1/2} = \rho$, $x = \rho \cos\theta$, $y = \rho \sin\theta$ yields, via the
standard change of variables formula:

$$J^2 = \int_0^{\infty} \int_0^{\pi/2} e^{-\rho^2}\, \rho\, d\rho\, d\theta.$$

The equality $J^2 = \pi/4$ results. ◁
Hankel contour representation. Euler’s integral representation of $\Gamma(s)$ used in
conjunction with the functional equation permits us to continue $\Gamma(s)$ to the whole of
the complex plane. A direct approach due to Hankel provides an alternative integral
representation valid for all values of $s$.


Theorem B.1 (Hankel’s contour integral). Let $\int_{+\infty}^{(0)}$ denote an integral taken along
a contour starting at $+\infty$ in the upper plane, winding counterclockwise around the
origin, and proceeding towards $+\infty$ in the lower half-plane. Then, for all $s \in \mathbb{C}$,

$$\frac{1}{\Gamma(s)} = -\frac{1}{2i\pi} \int_{+\infty}^{(0)} (-t)^{-s}\, e^{-t}\, dt. \tag{13}$$

In (13), $(-t)^{-s}$ is assumed to have its principal determination when $t$ is negative real,
and this determination is then extended uniquely by continuity throughout the contour.
The integral then closely resembles the definition of $\Gamma(1 - s)$. The first form of (13)
can also be rewritten as $\frac{1}{\pi} \sin(\pi s)\, \Gamma(1 - s)$, by virtue of the complement formula given below.


> B.8. Proof of Hankel’s representation. We refer to volume 2 of Henrici’s book [329, p. 35] 
or Whittaker and Watson’s treatise [604, p. 245] for a detailed proof. 

A contour of integration that fulfills the conditions of the theorem is typically the contour
$\mathcal{H}$ that is at distance 1 from the positive real axis, comprising three parts: a line parallel to the
positive real axis in the upper half-plane; a connecting semi-circle centered at the origin; a line
parallel to the positive real axis in the lower half-plane. More precisely, $\mathcal{H} = \mathcal{H}^- \cup \mathcal{H}^+ \cup \mathcal{H}^\circ$,
where

(14)
\[ \begin{aligned}
\mathcal{H}^- &= \{z = w - i,\ w \ge 0\} \\
\mathcal{H}^+ &= \{z = w + i,\ w \ge 0\} \\
\mathcal{H}^\circ &= \{z = -e^{i\phi},\ \phi \in [-\tfrac{\pi}{2}, \tfrac{\pi}{2}]\}.
\end{aligned} \]

Let $\epsilon$ be a small positive real number, and denote by $\epsilon \cdot \mathcal{H}$ the image of $\mathcal{H}$ by the
transformation $z \mapsto \epsilon z$. By analyticity, for the integral representation, we can equally well adopt as
integration path the contour $\epsilon \cdot \mathcal{H}$, for any $\epsilon > 0$. The main idea is then to let $\epsilon$ tend to 0.

Assume momentarily that $s < 0$. (The extension to arbitrary $s$ then follows by analytic
continuation.) The integral along $\epsilon \cdot \mathcal{H}$ decomposes into two parts:


1. The integral along the semi-circle is 0 if we take the circle of a vanishingly small
radius, since $-s > 0$.
2. The combined contributions from the upper and lower lines give, as $\epsilon \to 0$,

\[ \int_{+\infty}^{(0)} (-t)^{-s} e^{-t}\,dt = (-U + L) \int_0^\infty t^{-s} e^{-t}\,dt, \]

where $U$ and $L$ denote the determinations of $(-1)^{-s}$ on the half-lines lying in the
upper and lower half-planes respectively.


By continuity of determinations, $U = (e^{-i\pi})^{-s}$ and $L = (e^{+i\pi})^{-s}$. Therefore, the right-hand
side of (13) is equal to

\[ -\frac{(-e^{i\pi s} + e^{-i\pi s})}{2i\pi}\,\Gamma(1-s) = \frac{\sin(\pi s)}{\pi}\,\Gamma(1-s) = \frac{1}{\Gamma(s)}, \]

which completes the proof of the theorem. ◁
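
As a numerical sanity check of (13), the following sketch integrates $(-t)^{-s}e^{-t}$ along the contour of (14) with a composite Simpson rule. The truncation of the two half-lines at real part `W`, the function names, and the number of panels are illustrative assumptions, not part of the text; the neglected tails are of order $e^{-W}$.

```python
import cmath
import math

def simpson(f, a, b, n=4000):
    """Composite Simpson rule for a complex-valued f on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def inv_gamma_hankel(s, W=40.0):
    """Approximate 1/Gamma(s) by Hankel's contour integral (13).

    Path: the upper line from W + i to i, the half-circle t = -exp(i*phi)
    for phi from -pi/2 to pi/2, then the lower line from -i to W - i.
    """
    # Principal determination of (-t)^(-s): cmath.log has its cut on the
    # negative real axis, which -t never crosses along this path.
    def integrand(t):
        return cmath.exp(-s * cmath.log(-t) - t)

    upper = simpson(lambda w: integrand(complex(w, 1.0)), W, 0.0)
    arc = simpson(
        lambda phi: integrand(-cmath.exp(1j * phi)) * (-1j * cmath.exp(1j * phi)),
        -math.pi / 2, math.pi / 2)
    lower = simpson(lambda w: integrand(complex(w, -1.0)), 0.0, W)
    return -(upper + arc + lower) / (2j * math.pi)

for s in (2.5, -0.5, 0.0):
    exact = 0.0 if s == 0.0 else 1.0 / math.gamma(s)
    print(s, inv_gamma_hankel(s).real, exact)
```

The values at $s = 0$ and $s = -1$ come out numerically zero, which reflects the zeros of $1/\Gamma(s)$ at the non-positive integers.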
Expansions. The Gamma function has poles at the non-positive integers but has
no zeros. Accordingly, $1/\Gamma(s)$ is an entire function with zeros at $0, -1, \ldots$, and the
position of the zeros is reflected by the product decomposition,

(15)   \[ \frac{1}{\Gamma(s)} = s\,e^{\gamma s} \prod_{n=1}^{\infty} \left[\left(1 + \frac{s}{n}\right) e^{-s/n}\right] \]

(of the so-called Weierstrass type). There $\gamma \doteq 0.57721$ denotes Euler’s constant

\[ \gamma = \lim_{n\to\infty} (H_n - \log n) \equiv \sum_{n=1}^{\infty} \left[\frac{1}{n} - \log\left(1 + \frac{1}{n}\right)\right]. \]


The logarithmic derivative of the Gamma function is classically known as the psi
function and is denoted by $\psi(s)$:

\[ \psi(s) = \frac{d}{ds} \log \Gamma(s) = \frac{\Gamma'(s)}{\Gamma(s)}. \]

In accordance with (15), $\psi(s)$ admits a partial fraction decomposition

(16)   \[ \psi(s+1) = -\gamma - \sum_{n=1}^{\infty} \left[\frac{1}{n+s} - \frac{1}{n}\right]. \]


From (16), it can be seen that the Taylor expansion of $\psi(s+1)$, and hence of $\Gamma(s+1)$,
involves values of the Riemann zeta function, $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$, at the positive
integers: for $|s| < 1$,

\[ \psi(s+1) = -\gamma + \sum_{n=2}^{\infty} (-1)^n \zeta(n)\, s^{n-1}, \]

so that the coefficients in the expansion of $\Gamma(s)$ around any integer are polynomially
expressible in terms of Euler’s constant $\gamma$ and values of the zeta function at the
integers. For instance, as $s \to 0$,

\[ \Gamma(s+1) = 1 - \gamma s + \left(\frac{\pi^2}{12} + \frac{\gamma^2}{2}\right) s^2 + \left(-\frac{\zeta(3)}{3} - \frac{\pi^2\gamma}{12} - \frac{\gamma^3}{6}\right) s^3 + O(s^4). \]


Another direct consequence of the infinite product formulae for $\Gamma(s)$ and $\sin \pi s$
is the complement formula for the Gamma function,

(17)   \[ \Gamma(s)\,\Gamma(-s) = -\frac{\pi}{s \sin \pi s}, \]

which directly results from the factorization of the sine function (due to Euler),

\[ \sin s = s \prod_{n=1}^{\infty} \left(1 - \frac{s^2}{n^2\pi^2}\right). \]

In particular, Equation (17) gives back the special value (cf (12)): $\Gamma(1/2) = \sqrt{\pi}$.
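
The expansion of $\Gamma(s+1)$ near 0 and the complement formula (17) can be checked numerically. The sketch below uses only the standard library; the decimal values of Euler's constant and of $\zeta(3)$ are hard-coded, and the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant
ZETA3 = 1.2020569031595942         # zeta(3)

def gamma_taylor(s):
    """Cubic Taylor polynomial of Gamma(s + 1) at s = 0, from the zeta-value expansion."""
    g = EULER_GAMMA
    c2 = math.pi**2 / 12 + g**2 / 2
    c3 = -ZETA3 / 3 - math.pi**2 * g / 12 - g**3 / 6
    return 1 - g * s + c2 * s**2 + c3 * s**3

def complement_gap(s):
    """Difference between the two sides of the complement formula (17)."""
    return math.gamma(s) * math.gamma(-s) + math.pi / (s * math.sin(math.pi * s))

for s in (0.1, 0.01, -0.01):
    # The discrepancy should shrink like s**4.
    print(s, math.gamma(s + 1) - gamma_taylor(s))
print(complement_gap(0.3), math.gamma(0.5) ** 2 - math.pi)
```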
> B.9. The duplication formula. This is

\[ 2^{2s-1}\,\Gamma(s)\,\Gamma(s + \tfrac{1}{2}) = \pi^{1/2}\,\Gamma(2s), \]

which provides the expansion of $\Gamma$ near 1/2:

\[ \Gamma(s + \tfrac{1}{2}) = \pi^{1/2} - (\gamma + 2\log 2)\,\pi^{1/2} s + \left(\frac{\pi^{5/2}}{4} + \frac{(\gamma + 2\log 2)^2\,\pi^{1/2}}{2}\right) s^2 + O(s^3). \]

The coefficients now involve $\log 2$ as well as zeta values. ◁


Finally, a famous and absolutely fundamental asymptotic formula is Stirling’s
approximation, familiarly known as “Stirling’s formula”:

\[ \Gamma(s+1) = s\,\Gamma(s) \sim s^s e^{-s} \sqrt{2\pi s} \left[1 + \frac{1}{12s} + \frac{1}{288s^2} - \frac{139}{51840s^3} + \cdots\right]. \]

It is valid for (large) real $s > 0$, and more generally for all $s \to \infty$ in $|\arg(s)| <
\pi - \delta$ (any $\delta > 0$). For the purpose of obtaining effective bounds, the following
quantitative relation [604, p. 253] often proves useful,

\[ \Gamma(s+1) = s^s e^{-s} (2\pi s)^{1/2}\, e^{\theta/(12s)}, \qquad \text{where } 0 < \theta \equiv \theta(s) < 1, \]

an equality that holds now for all $s \ge 1$. Stirling’s formula is usually established by
appealing to the method of Laplace applied to the integral representation for $\Gamma(s+1)$,
see Appendix B.6: Laplace’s method, p. 755, or by Euler–Maclaurin summation
(Note A.7, p. 726). It is derived by Mellin transforms in Appendix B.7, p. 762.
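
Both forms of Stirling's formula are easy to test numerically. The sketch below compares the four-term series with `math.gamma` and solves the quantitative relation for $\theta(s)$ using `math.lgamma`; the function names are illustrative, and only moderate values of $s$ are used because the subtraction defining $\theta$ loses precision for very large $s$.

```python
import math

def stirling_series(s, terms=4):
    """Stirling's formula for Gamma(s + 1), keeping the first `terms` correction factors."""
    corrections = [1.0, 1 / (12 * s), 1 / (288 * s**2), -139 / (51840 * s**3)]
    return s**s * math.exp(-s) * math.sqrt(2 * math.pi * s) * sum(corrections[:terms])

def theta(s):
    """Solve Gamma(s+1) = s^s e^(-s) (2 pi s)^(1/2) e^(theta/(12 s)) for theta."""
    log_main = s * math.log(s) - s + 0.5 * math.log(2 * math.pi * s)
    return 12 * s * (math.lgamma(s + 1) - log_main)

for s in (1.0, 5.0, 20.0):
    rel_err = stirling_series(s) / math.gamma(s + 1) - 1
    print(s, rel_err, theta(s))
```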


> B.10. The Eulerian Beta function. It is defined for $\Re(p), \Re(q) > 0$ by any of the following
integrals,

\[ B(p,q) := \int_0^1 x^{p-1}(1-x)^{q-1}\,dx = \int_0^\infty \frac{y^{p-1}}{(1+y)^{p+q}}\,dy = 2\int_0^{\pi/2} \cos^{2p-1}\theta\, \sin^{2q-1}\theta\,d\theta, \]

where the last form is known as a Wallis integral. It satisfies:

\[ B(p,q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)}. \]

[See [604, p. 254] for a proof generalizing that of Note B.7.] ◁


> B.11. Facts about the Riemann zeta function ($\zeta$). Here are a few properties of this function,
whose elementary theory centrally involves the Gamma function. It is initially defined by

\[ \zeta(s) := \sum_{n\ge 1} \frac{1}{n^s}, \qquad \Re(s) > 1, \]

and it admits a meromorphic extension to the whole of $\mathbb{C}$, with only a pole at $s = 1$, where
$\zeta(s) = 1/(s-1) + \gamma + \cdots$ and $\gamma$ is Euler’s constant. Special values for $k \in \mathbb{Z}_{>0}$ are

\[ \zeta(2k) = \frac{2^{2k-1}}{(2k)!}\,|B_{2k}|\,\pi^{2k}, \qquad \zeta(-2k+1) = -\frac{B_{2k}}{2k}, \qquad \zeta(-2k) = 0, \]

with $B_j$ a Bernoulli number. Other interesting values are $\zeta(0) = -\frac{1}{2}$, $\zeta'(0) = -\log\sqrt{2\pi}$.
The functional equation admits many forms, among which the reflection formula:

\[ \Gamma\left(\frac{s}{2}\right) \pi^{-s/2}\, \zeta(s) = \Gamma\left(\frac{1-s}{2}\right) \pi^{-(1-s)/2}\, \zeta(1-s). \]

The proofs make essential use of Mellin transforms (Appendix B.7, p. 762, and especially
Equation (46), p. 764) as well as Hankel contours. Accessible introductions are to be found
in [186, 578, 604]. ◁
B.4. Holonomic functions


Doron Zeilberger [626] has introduced discrete mathematicians to a powerful 
framework, the holonomic framework, which takes its roots in classical differential 
algebra [72, 133] and has found innumerable applications in the theory of special 
functions and symbolic computation [480], combinatorial identities, and combinato- 
rial enumeration. In these pages, we can only offer a (too) brief orientation tour of this 
wonderful theory. Major contributions in the perspective of Analytic Combinatorics 
are due to Stanley [551], Zeilberger [626], Gessel [289], and Lipshitz [409, 410]. As 
we shall see there is a chain of growing generality and power, 


rational — algebraic — holonomic. 


The associated asymptotic problems are examined in Subsection VII. 9.1, p. 518 (“regular”
singularities) and Section VIII. 7, p. 581 (“irregular” singularities).

Univariate holonomic functions. Holonomic functions¹ are solutions of linear
differential equations or systems whose coefficients are rational functions. The
univariate theory is elementary.


Definition B.1. A formal power series (or function) $f(z)$ is said to be holonomic if it
satisfies a linear differential equation,

(18)   \[ c_0(z) \frac{d^r}{dz^r} f(z) + c_1(z) \frac{d^{r-1}}{dz^{r-1}} f(z) + \cdots + c_r(z) f(z) = 0, \]

where the coefficients $c_j(z)$ lie in the field $\mathbb{C}(z)$ of rational functions. Equivalently,
$f(z)$ is holonomic if the vector space over $\mathbb{C}(z)$ spanned by the set of all its derivatives
$\{\partial^j f(z)\}_{j=0}^{\infty}$ is finite dimensional.


By clearing denominators, we can assume, if needed, the quantities $c_j(z)$ in (18)
to be polynomials. It then follows that the coefficient sequence $(f_n)$ of a holonomic
$f(z)$ satisfies a recurrence,

(19)   \[ \widehat{c}_s(n) f_{n+s} + \widehat{c}_{s-1}(n) f_{n+s-1} + \cdots + \widehat{c}_0(n) f_n = 0, \]

for some polynomials $\widehat{c}_j(n)$, provided $n \ge n_0$ (some $n_0$). Such a recurrence (19) is
known as a P-recurrence. (The two properties of sequences, to be the coefficients of
a holonomic function and to be P-recursive, are equivalent.)
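
A P-recurrence of the form (19) determines a sequence from finitely many initial values, provided its leading coefficient does not vanish. The following is a minimal sketch of that unrolling in exact arithmetic; the function name and the representation of the polynomial coefficients as Python callables are assumptions made for illustration.

```python
from fractions import Fraction

def unroll_p_recurrence(coeffs, initial, count):
    """Terms f_0..f_{count-1} of sum_j coeffs[j](n) * f_{n+j} = 0 for n >= 0.

    `coeffs` lists the polynomial coefficients of (19) as callables, lowest
    shift first; `initial` supplies f_0..f_{s-1}.  The leading coefficient
    is assumed not to vanish at any n >= 0.
    """
    s = len(coeffs) - 1
    f = [Fraction(x) for x in initial]
    n = 0
    while len(f) < count:
        lead = coeffs[s](n)
        tail = sum((coeffs[j](n) * f[n + j] for j in range(s)), Fraction(0))
        f.append(-tail / lead)
        n += 1
    return f[:count]

# Catalan numbers: (n + 2) f_{n+1} - (4n + 2) f_n = 0, f_0 = 1.
catalan = unroll_p_recurrence([lambda n: -(4 * n + 2), lambda n: n + 2], [1], 8)

# Coefficients 1/(n!)^2 of sum z^n/(n!)^2: (n + 1)^2 f_{n+1} - f_n = 0, f_0 = 1.
inv_fact_sq = unroll_p_recurrence([lambda n: -1, lambda n: (n + 1) ** 2], [1], 5)

print([int(c) for c in catalan])   # [1, 1, 2, 5, 14, 42, 132, 429]
print(inv_fact_sq)
```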

Functions such as $e^z$, $\log z$, $\cos(z)$, $\arcsin(z)$, $\sqrt{1+z}$, and $\mathrm{Li}_2(z) :=
\sum_{n\ge 1} z^n/n^2$ are holonomic. Formal power series like $\sum z^n/(n!)^2$ and $\sum n!\,z^n$
are holonomic. Sequences like $\frac{1}{n+1}\binom{2n}{n}$, $2^n/(n^2+1)$ are coefficients of holonomic
functions and are P-recursive. However, sequences like $\sqrt{n}$, $\log n$ are not P-recursive,
a fact that can be proved by an examination of singularities of associated
generating functions [232]. For similar reasons, $\tan z$, $\sec z$, and $\Gamma(z)$ that have
infinitely many singularities are not holonomic.
Holonomic functions enjoy a rich set of closure properties. Define the Hadamard
product of two functions $h = f \odot g$ to be the termwise product of series: $[z^n]h(z) =
([z^n]f(z)) \cdot ([z^n]g(z))$. We have the following theorem.


¹ A synonymous name is ∂-finite or D-finite.
Theorem B.2 (Univariate holonomic closure). The class of univariate holonomic
functions is closed under the following operations: sum ($+$), product ($\times$), Hadamard
product ($\odot$), differentiation ($\partial_z$), indefinite integration ($\int^z$), and algebraic
substitution ($z \mapsto y(z)$ for some algebraic function $y(z)$).


Proof. An exercise in vector space manipulations. For instance, let $\mathrm{VS}(\partial^\star f)$ be
the vector space over $\mathbb{C}(z)$ spanned by the derivatives $\{\partial^j f\}_{j\ge 0}$. If $h = f + g$
(or $h = f \cdot g$), then $\mathrm{VS}(\partial^\star h)$ is finite dimensional since it is included in the direct
sum $\mathrm{VS}(\partial^\star f) \oplus \mathrm{VS}(\partial^\star g)$ (respectively, the tensor product $\mathrm{VS}(\partial^\star f) \otimes \mathrm{VS}(\partial^\star g)$). For
Hadamard products, if $h_n = f_n g_n$, then a system of P-recurrences can be obtained
for the quantities $h_n^{(i,j)} = f_{n+i}\, g_{n+j}$ from the recurrences satisfied by $f_n$, $g_n$, and then
a single P-recurrence can be obtained. Closure under algebraic substitution results
from the methods of Note B.12. See Stanley’s historic paper [551] and his book chapter
[554, Ch. 6] for details. ∎


> B.12. Algebraic functions are holonomic. Let $y(z)$ satisfy $P(z, y(z)) = 0$, with $P$ a polynomial.
Any non-degenerate rational fraction $Q(z, y(z))$ can be expressed as a polynomial
in $y(z)$ with coefficients in $\mathbb{C}(z)$. [Proof: let $D$ be the denominator of $Q$; the Bezout relation
$AP - BD = 1$ (in $\mathbb{C}(z)[y]$), obtained by a gcd calculation between polynomials (in $y$), expresses
$1/D$ as a polynomial in $y$.] Then, all derivatives of $y$ live in the space spanned over
$\mathbb{C}(z)$ by $1, y, \ldots, y^{d-1}$, with $d = \deg_y P(z, y)$. (The fact that algebraic functions are holonomic
was known to Abel [1, p. 287], and an algorithm has been described in recent times by
Comtet [128].) The closure under algebraic substitutions ($z \mapsto y(z)$) asserted in Theorem B.2
can be established along similar lines. ◁


Zeilberger observed that holonomic functions with coefficients in Q can be spe- 
cified by a finite amount of information. Equality in this subclass is then a decidable 
property, as the following skeleton algorithm suggests (detailed validity conditions are 
omitted). 


Algorithm Z: Decide whether two holonomic functions $A(z)$, $B(z)$ are equal
    Let $\Sigma$, $T$ be holonomic descriptions of $A$, $B$ (by equations or systems);
    Compute a holonomic differential equation $\Upsilon$ for $h := A - B$;
    Let $e$ be the order of $\Upsilon$;
    Output ‘equal’ iff $h(0) = h'(0) = \cdots = h^{(e-1)}(0) = 0$.


The book titled “A = B” by Petkovšek, Wilf, and Zeilberger [480] abundantly illustrates
the application of this method to combinatorial and special function identities.
Interest in the approach is reinforced by the existence of powerful symbolic manip- 
ulation systems and algorithms: Salvy and Zimmermann [531] have implemented 
univariate algebraic closure operations; Chyzak and Salvy [120, 123] have developed 
algorithms for multivariate holonomicity discussed below. 


Example B.1. The Euler–Landen identities for dilogarithms. Let as usual $\mathrm{Li}_\alpha(z) :=
\sum_{n\ge 1} z^n/n^\alpha$ represent the polylogarithm function (p. 408). Around 1760, Landen and Euler
discovered the dilogarithmic identity [52, p. 247],

(20)   \[ \mathrm{Li}_2\left(\frac{-z}{1-z}\right) = -\frac{1}{2}\log^2(1-z) - \mathrm{Li}_2(z), \]

which corresponds to the (easy) identity on coefficients (extract $[z^n]$)

(21)   \[ \sum_{k=1}^{n} \binom{n-1}{k-1} \frac{(-1)^k}{k^2} = -\frac{1}{n^2} - \frac{1}{n}\sum_{k=1}^{n-1}\frac{1}{k}, \]

and specializes (at $z = 1/2$) to the infinite series evaluation

\[ \mathrm{Li}_2\left(\frac{1}{2}\right) \equiv \sum_{n\ge 1} \frac{1}{n^2\, 2^n} = \frac{\pi^2}{12} - \frac{1}{2}\log^2 2. \]

Write $A$ and $B$ for the left and right sides of (20), respectively. The differential equations for $A$,
$B$ are built in stages, according to closure properties:

        $\mathrm{Li}_1(z)$:    $(1-z)\,\partial^2 y - \partial y = 0$

        $\mathrm{Li}_1(z)^2$:  $(1-z)^2\,\partial^3 y - 3(1-z)\,\partial^2 y + \partial y = 0$

(22)    $\mathrm{Li}_2(z)$:    $z(1-z)\,\partial^3 y + (2-3z)\,\partial^2 y - \partial y = 0$
Biz): —3(B6z> +---)(1 — z)®a®y + --- — 48(2252> +--+ )dy =0 

        $A(z)$:    $z(1-z)^2\,\partial^3 y + (1-z)(2-5z)\,\partial^2 y - (3-4z)\,\partial y = 0$


Thus, $A - B$ lives a priori in a vector space of dimension $12 = 3 + 9$. It thus suffices to check
the coincidence of the expansions of both members of (20) up to order 12 in order to prove the
identity $A = B$. (An upper bound on the dimension of the vector space is actually enough.)
Equivalently, given the automatic computations of (22), it suffices to verify sufficiently many
cases of the identity (21) in order to have a complete proof of it. ∎
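
The final verification step of this example can be sketched in exact arithmetic: extract $[z^n]$ on both sides of the Euler–Landen identity and compare up to the order bound 12. The code below does only this coefficient comparison, not the automatic computation of the differential equations; the function names are illustrative, and the coefficient formulas come from $[z^n]\,z^k(1-z)^{-k} = \binom{n-1}{k-1}$ and $\log(1-z) = -\sum_k z^k/k$.

```python
from fractions import Fraction
from math import comb

def lhs_coeff(n):
    """[z^n] Li_2(-z/(1-z)), using [z^n] z^k (1-z)^(-k) = C(n-1, k-1)."""
    return sum((Fraction((-1) ** k * comb(n - 1, k - 1), k * k)
                for k in range(1, n + 1)), Fraction(0))

def rhs_coeff(n):
    """[z^n] of -(1/2) log^2(1-z) - Li_2(z)."""
    # [z^n] log^2(1-z) is the convolution sum of 1/(k (n-k)) over 1 <= k < n.
    log_sq = sum((Fraction(1, k * (n - k)) for k in range(1, n)), Fraction(0))
    return -log_sq / 2 - Fraction(1, n * n)

ORDER = 12  # the dimension bound 3 + 9 quoted in the example
print(all(lhs_coeff(n) == rhs_coeff(n) for n in range(1, ORDER + 1)))
```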


> B.13. Holonomic functions as solutions of systems. (This is a simple outcome of Note VII.48,
p. 522.) A holonomic function $y(z)$ which satisfies a linear differential equation of order $m$ with
coefficients in $\mathbb{C}(z)$ is also the first component of a first-order differential system of dimension $m$
with rational coefficients: $y(z) = Y_1(z)$, where

(23)
\[ \left\{ \begin{aligned}
\frac{d}{dz} Y_1(z) &= a_{11}(z) Y_1(z) + \cdots + a_{1m}(z) Y_m(z) \\
&\ \,\vdots \\
\frac{d}{dz} Y_m(z) &= a_{m1}(z) Y_1(z) + \cdots + a_{mm}(z) Y_m(z),
\end{aligned} \right. \]

where each $a_{i,j}(z)$ is a rational function. Conversely, any solution of a system (23) with the
$a_{i,j} \in \mathbb{C}(z)$ is holonomic in the sense of Definition B.1. ◁


> B.14. The Laplace transform. Let $f(z) = \sum_{n\ge 0} f_n z^n$ be a formal power series. Its (formal)
Laplace transform $g = \mathcal{L}[f]$ is defined as the formal power series:

\[ \mathcal{L}[f](x) = \sum_{n=0}^{\infty} n!\, f_n x^n. \]

(Thus Laplace transforms convert EGFs into OGFs.) Under suitable convergence conditions,
the Laplace transform is analytically representable by

\[ \mathcal{L}[f](x) = \int_0^\infty f(xz)\, e^{-z}\,dz. \]

The following property holds: A series is holonomic if and only if its Laplace transform is
holonomic. [Hint: use P-recurrences (19).] ◁
> B.15. Hypergeometric functions. It is customary to employ the notation $(a)_n$ for representing
the rising factorial $a(a+1)\cdots(a+n-1)$. The function of one variable, $z$, and three parameters,
$a$, $b$, $c$, defined by

(24)   \[ F[a, b; c; z] = 1 + \sum_{n=1}^{\infty} \frac{(a)_n (b)_n}{(c)_n} \frac{z^n}{n!}, \]

is known as a hypergeometric function. It satisfies the differential equation

(25)   \[ z(1-z) \frac{d^2y}{dz^2} + \bigl(c - (a+b+1)z\bigr) \frac{dy}{dz} - ab\,y = 0, \]

and is consequently a holonomic function. An accessible introduction appears in [604, Ch. XIV].
The generalized hypergeometric function (or series) depends on $p + q$ parameters
$a_1, \ldots, a_p$ and $c_1, \ldots, c_q$, and is defined by

(26)   \[ {}_pF_q[a_1, \ldots, a_p; c_1, \ldots, c_q; z] = 1 + \sum_{n=1}^{\infty} \frac{(a_1)_n \cdots (a_p)_n}{(c_1)_n \cdots (c_q)_n} \frac{z^n}{n!}, \]

so that $F$ in (24) is a ${}_2F_1$. Hypergeometric functions satisfy a rich set of identities [193, 542],
many of which can be verified (though not discovered) by Algorithm Z.
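
The link between the series (24) and the differential equation (25) can be made concrete: extracting $[z^n]$ in (25) gives the first-order P-recurrence $(n+1)(n+c)\,f_{n+1} = (n+a)(n+b)\,f_n$. The sketch below checks this in exact arithmetic, assuming the rising-factorial convention $(a)_n = a(a+1)\cdots(a+n-1)$; the function names and the sample parameters are illustrative.

```python
from fractions import Fraction
import math

def rising(a, n):
    """Rising factorial a (a+1) ... (a+n-1); equals 1 when n = 0."""
    result = Fraction(1)
    for k in range(n):
        result *= a + k
    return result

def hyp2f1_coeffs(a, b, c, count):
    """First `count` Taylor coefficients of F[a, b; c; z] as exact fractions."""
    return [rising(a, n) * rising(b, n) / (rising(c, n) * math.factorial(n))
            for n in range(count)]

# F[1, 1; 2; z] = -log(1 - z)/z, whose coefficients are 1/(n + 1).
print(hyp2f1_coeffs(1, 1, 2, 4))

# Recurrence implied by (25): (n+1)(n+c) f_{n+1} = (n+a)(n+b) f_n.
a, b, c = Fraction(1, 2), Fraction(1, 3), Fraction(5, 4)
f = hyp2f1_coeffs(a, b, c, 12)
print(all((n + 1) * (n + c) * f[n + 1] == (n + a) * (n + b) * f[n] for n in range(11)))
```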

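Multivariate holonomic functions. Let $\mathbf{z} = (z_1, \ldots, z_m)$ be a collection of
variables and $\mathbb{C}(\mathbf{z})$ the field of all rational fractions in the variables $\mathbf{z}$. For $\mathbf{n} =
(n_1, \ldots, n_m)$, we define $\mathbf{z}^{\mathbf{n}}$ to be $z_1^{n_1} \cdots z_m^{n_m}$ and let $\partial^{\mathbf{n}}$ represent
$\partial_{z_1}^{n_1} \cdots \partial_{z_m}^{n_m}$.


Definition B.2. A multivariate formal power series (or function) $f(\mathbf{z})$ is said to be
holonomic if the vector space over $\mathbb{C}(\mathbf{z})$ spanned by the set of all derivatives $\{\partial^{\mathbf{n}} f(\mathbf{z})\}$
is finite dimensional.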


Since the partial derivatives $\partial_{z_1}^{j} f$ are bound, a multivariate holonomic function
satisfies a differential equation of the form

\[ c_{1,0}(\mathbf{z}) \frac{\partial^{r_1}}{\partial z_1^{r_1}} f(\mathbf{z}) + \cdots + c_{1,r_1}(\mathbf{z}) f(\mathbf{z}) = 0, \]

and similarly for $z_2, \ldots, z_m$. (Any system of equations with possibly mixed partial
derivatives that allows one to determine all partial derivatives in terms of a finite number
of them serves to define a multivariate holonomic function.) Denominators can be
cleared, upon multiplication by the l.c.m. of all the denominators that figure in the system
of defining equations. There results that coefficients of multivariate holonomic
functions satisfy particular systems of recurrence equations with polynomial coefficients,
which are characterized in [410].

Given $f(\mathbf{z})$ viewed as a function of $z_1$, $z_2$ (the remaining variables being parameters)
and abbreviated as $f(z_1, z_2)$, the diagonal with respect to variables $z_1$, $z_2$ is

\[ \mathrm{Diag}_{z_1,z_2}[f(z_1, z_2)] = \sum_{\nu} f_{\nu,\nu}\, z_1^{\nu}, \qquad \text{where } f(z_1, z_2) = \sum_{n_1,n_2} f_{n_1,n_2}\, z_1^{n_1} z_2^{n_2}. \]

The Hadamard product is defined, as in the univariate case, with respect to a specific
variable (e.g., $z_1$).


Theorem B.3 (Multivariate holonomic closure). The class of multivariate holonomic
functions is closed under the following operations: sum ($+$), product ($\times$), Hadamard
product ($\odot$), differentiation ($\partial$), indefinite integration ($\int$), algebraic substitution,
specialization (setting some variable to a constant), and diagonal.


An elementary proof of this remarkable theorem (in the sense that it does not
appeal to higher concepts of differential algebra) is given by Lipshitz in [409, 410].
The closure theorem and its companion algorithms [120, 570] make it possible to
prove or verify identities automatically, many of which are non-trivial. For instance,
in his proof of the irrationality of the number $\zeta(3) = \sum_{n\ge 1} 1/n^3$, Apéry introduced
the combinatorial sequence,

(27)   \[ A_n = \sum_{k=0}^{n} \binom{n}{k}^2 \binom{n+k}{k}^2, \]

for which a proof was needed [588] of the fact that it satisfies the recurrence

(28)   \[ (n+1)^3 B_n + (n+2)^3 B_{n+2} - (2n+3)(17n^2 + 51n + 39) B_{n+1} = 0, \]

with $B_1 = 5$, $B_2 = 73$. Obviously, the generating function $B(z)$ of the sequence
$(B_n)$ as defined by the P-recurrence (28) is univariate holonomic. Repeated use of
the multivariate closure theorem shows that the ordinary generating function $A(z)$ of
the sequence $A_n$ of (27) is holonomic. (Indeed, start from the explicit

\[ \sum_{n_1,n_2} \binom{n_1}{n_2} z_1^{n_1} z_2^{n_2} = \frac{1}{1 - z_1(1+z_2)}, \qquad \sum_{n_1,n_2} \binom{n_1+n_2}{n_2} z_1^{n_1} z_2^{n_2} = \frac{1}{1 - z_1 - z_2}, \]

and apply suitable Hadamard products and diagonal operations.) This gives an ordinary
differential equation satisfied by $A(z)$. The proof is then completed by checking
that $A_n$ and $B_n$ coincide for enough initial values of $n$.
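
The last step, checking that the binomial sum and the recurrence agree on initial values, is easy to carry out. The sketch below computes the Apéry numbers from their defining sum and evaluates the left-hand side of the three-term recurrence on them; it is a finite check only, and by itself does not replace the holonomic argument that bounds how many values need to be compared. The function names are illustrative.

```python
from math import comb

def apery(n):
    """A_n = sum_k C(n,k)^2 C(n+k,k)^2, the Apery numbers."""
    return sum(comb(n, k) ** 2 * comb(n + k, k) ** 2 for k in range(n + 1))

def recurrence_defect(n):
    """(n+1)^3 A_n + (n+2)^3 A_{n+2} - (2n+3)(17n^2+51n+39) A_{n+1}."""
    return ((n + 1) ** 3 * apery(n) + (n + 2) ** 3 * apery(n + 2)
            - (2 * n + 3) * (17 * n * n + 51 * n + 39) * apery(n + 1))

print([apery(n) for n in range(5)])   # [1, 5, 73, 1445, 33001]
print(all(recurrence_defect(n) == 0 for n in range(40)))
```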

Holonomic functions in infinitely many variables. Let $f$ be a power series in
infinitely many variables $x_1, x_2, \ldots$. Let $S \subset \mathbb{Z}_{>0}$ be a subset of indices. We write
$f_S$ for the specialization of $f$ in which all the variables whose indices do not belong
to $S$ are set to 0. Following Gessel [289], we say that the series $f$ is holonomic if, for
each finite $S$, the specialization $f_S$ is holonomic (in the variables $x_s$ for $s \in S$). Gessel
has developed a powerful calculus in the case of series $f$ that are symmetric functions,
with stunning consequences for combinatorial enumeration.

An undirected graph is called $k$-regular if every vertex has degree exactly $k$. A
standard Young tableau is the Ferrers diagram of an integer partition, filled with consecutive
integers in a way that is increasing along rows and columns. The classical
Robinson–Schensted–Knuth correspondence establishes a bijection between permutations
of size $n$ and pairs of Young tableaux of size $n$ having the same shape. The
common height of the tableaux in the pair associated to a permutation $\sigma$ coincides
with the length of the longest increasing subsequence of $\sigma$. A $k \times n$ Latin rectangle is
a $k \times n$ matrix with elements in the set $\{1, 2, \ldots, n\}$ such that entries in each row and
column are distinct. (It is thus a $k$-tuple of “discordant” permutations.)

Gessel’s calculus [288, 289] provides a unified approach for establishing the holo- 
nomic character of many generating functions of combinatorial structures, such as: 
Young tableaux, permutations of uniform multisets, increasing subsequences in per- 
mutations, Latin rectangles, regular graphs, matrices with fixed row and column sums, 
and so on. For instance: the generating functions of Latin rectangles and Young
tableaux of height at most $k$, of $k$-regular graphs, and of permutations with longest
increasing subsequence of length $k$ are holonomic functions. In particular, the number
$Y_{n,k}$ of permutations of size $n$ with longest increasing subsequence $\le k$ satisfies

(29)   \[ \sum_{n\ge 0} Y_{n,k} \frac{z^{2n}}{n!^2} = \det\bigl[I_{|i-j|}(2z)\bigr]_{1\le i,j\le k}, \qquad \text{where } I_\nu(2z) = \sum_{n=0}^{\infty} \frac{z^{2n+\nu}}{n!\,(n+\nu)!}, \]

that is, a corresponding GF is expressible as a determinant of Bessel functions. Other
applications are described in [122, 444].
The asymptotic problems relative to the holonomic framework are examined in 
Subsection VII. 9.1, p. 518 and Section VIII. 7, p. 581. 
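
The Bessel determinant formula for permutations with bounded longest increasing subsequence can be tested in its smallest non-trivial case. For $k = 2$ the determinant reduces to $I_0(2z)^2 - I_1(2z)^2$, with $I_\nu(2z) = \sum_n z^{2n+\nu}/(n!\,(n+\nu)!)$, and the sketch below compares $n!^2\,[z^{2n}]$ of that expression with a brute-force count over all permutations of size $n \le 6$. The function names and the restriction to $k = 2$ are choices made for illustration.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def lis_length(perm):
    """Length of the longest increasing subsequence (quadratic dynamic programme)."""
    best = []
    for i, x in enumerate(perm):
        best.append(1 + max((best[j] for j in range(i) if perm[j] < x), default=0))
    return max(best, default=0)

def count_by_brute_force(n, k):
    """Number of permutations of size n whose longest increasing subsequence is <= k."""
    return sum(1 for p in permutations(range(n)) if lis_length(p) <= k)

def count_from_determinant_k2(n):
    """n!^2 [z^(2n)] (I_0(2z)^2 - I_1(2z)^2), the k = 2 determinant."""
    i0_sq = sum((Fraction(1, (factorial(a) * factorial(n - a)) ** 2)
                 for a in range(n + 1)), Fraction(0))
    i1_sq = sum((Fraction(1, factorial(a) * factorial(a + 1)
                          * factorial(n - 1 - a) * factorial(n - a))
                 for a in range(n)), Fraction(0))
    return (i0_sq - i1_sq) * factorial(n) ** 2

print([int(count_from_determinant_k2(n)) for n in range(7)])   # [1, 1, 2, 5, 14, 42, 132]
```

The values are the Catalan numbers, as expected for permutations with no increasing subsequence of length 3.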


B.5. Implicit Function Theorem 


In its real-variable version, the Implicit Function Theorem asserts that, for a
sufficiently smooth function $F(z, w)$ of two variables, a solution to the equation
$F(z, w) = 0$ exists in the vicinity of a solution point $(z_0, w_0)$ (therefore satisfying
$F(z_0, w_0) = 0$) provided the partial derivative satisfies $F'_w(z_0, w_0) \ne 0$. This theorem
admits a complex extension, which is essential for the analysis of recursive structures.

Without loss of generality, one restricts attention to $(z_0, w_0) = (0, 0)$. We consider
here a function $F(z, w)$ that is analytic in two complex variables in the sense
that it admits a convergent representation valid in a polydisc,

(30)   \[ F(z, w) = \sum_{m,n\ge 0} f_{m,n}\, z^m w^n, \qquad |z| < R, \quad |w| < S, \]

for some $R, S > 0$ (cf Appendix B.8: Several complex variables, p. 767).


Theorem B.4 (Analytic Implicit Functions). Let $F$ be bivariate analytic near $(0, 0)$.
Assume that $F(0, 0) = f_{0,0} = 0$ and $F'_w(0, 0) = f_{0,1} \ne 0$. Then, there exists a unique
function $f(z)$ analytic in a neighbourhood $|z| < \rho$ of 0 such that $f(0) = 0$ and

\[ F(z, f(z)) = 0, \qquad |z| < \rho. \]

> B.16. Proofs of the Implicit Function Theorem. See Hille’s book [334] for details.

(i) Proof by residues. Make use of the principle of the argument and Rouché’s Theorem to
see that the equation $F(z, w) = 0$ has a unique solution near 0 for $|z|$ small enough. Appeal then to
the result, based on the residue theorem, that expresses the sum of the solutions to an equation
as a contour integral: with $C$ a small enough contour around 0 in the $w$-plane, one has

(31)   \[ f(z) = \frac{1}{2i\pi} \int_C w\, \frac{F'_w(z, w)}{F(z, w)}\,dw \]

(Note IV.39, p. 270), which is checked to represent an analytic function of $z$.

(ii) Proof by majorant series. Set $G(z, w) := w - \frac{1}{f_{0,1}} F(z, w)$. The equation $F(z, w) = 0$
becomes the fixed-point equation $w = G(z, w)$. The bivariate series $G$ has its coefficients
dominated termwise by those of

\[ \widehat{G}(z, w) = \frac{A}{(1 - z/R)(1 - w/S)} - A - A\frac{w}{S}. \]

The equation $w = \widehat{G}(z, w)$ is quadratic. It admits a solution $\widehat{f}(z)$ analytic at 0,

\[ \widehat{f}(z) = A\frac{z}{R} + \frac{A(A^2 + AS + S^2)}{S^2}\, \frac{z^2}{R^2} + \cdots, \]

whose coefficients dominate termwise those of $f$.

(iii) Proof by Picard’s method of successive approximants. With $G$ as before, define the
sequence of functions

\[ \phi_0(z) := 0; \qquad \phi_{j+1}(z) = G(z, \phi_j(z)), \]

each analytic in a small neighbourhood of 0. Then $f(z)$ can be obtained as

\[ f(z) = \lim_{j\to\infty} \phi_j(z) \equiv \phi_0(z) - \sum_{j=0}^{\infty} \bigl(\phi_j(z) - \phi_{j+1}(z)\bigr), \]

which is itself checked to be analytic near 0 by the geometric convergence of the series. ◁
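
Picard's successive approximants of item (iii) can be followed at the level of truncated power series. The sketch below takes the hypothetical example $F(z, w) = w - z - w^2$, for which $F(0,0) = 0$, $F'_w(0,0) = 1$ and $G(z, w) = z + w^2$; each iteration fixes at least one more Taylor coefficient of the solution $f(z) = (1 - \sqrt{1-4z})/2$. The helper names are illustrative.

```python
def mul_trunc(p, q, order):
    """Product of two coefficient lists, truncated to degree < order."""
    out = [0] * order
    for i, a in enumerate(p):
        if a:
            for j, b in enumerate(q):
                if i + j < order:
                    out[i + j] += a * b
    return out

def picard_iterates(order):
    """Iterate phi_{j+1} = G(z, phi_j) with G(z, w) = z + w^2 and phi_0 = 0.

    Returns the Taylor coefficients of the limit up to degree order - 1
    (order >= 2 is assumed).
    """
    z = [0, 1] + [0] * (order - 2)
    phi = [0] * order
    for _ in range(order):
        square = mul_trunc(phi, phi, order)
        phi = [z[i] + square[i] for i in range(order)]
    return phi

print(picard_iterates(8))   # [0, 1, 1, 2, 5, 14, 42, 132]
```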


Weierstrass Preparation. The Weierstrass Preparation Theorem (WPT), also familiarly
known as Vorbereitungssatz, is a useful complement to the Implicit Function
Theorem.


Given a collection $\mathbf{z} = (z_1, \ldots, z_m)$ of variables, we designate as usual by $\mathbb{C}[[\mathbf{z}]]$
the ring of formal power series in indeterminates $\mathbf{z}$. We let $\mathbb{C}\{\mathbf{z}\}$ denote the subset of
these that are convergent in a neighbourhood of $(0, \ldots, 0)$, i.e., analytic (cf Appendix B.8:
Several complex variables, p. 767).


Theorem B.5 (Weierstrass Preparation). Let $F = F(z_1, \ldots, z_m)$ in $\mathbb{C}[[\mathbf{z}]]$ (respectively,
$\mathbb{C}\{\mathbf{z}\}$) be such that $F(0, \ldots, 0) = 0$ and $F$ depends on at least one of the $z_j$ with
$j \ge 2$ (i.e., $F(0, z_2, \ldots, z_m)$ is not identically 0). Define a Weierstrass polynomial to
be a polynomial of the form

\[ W(z) = z^d + g_1 z^{d-1} + \cdots + g_d, \]

where $g_j \in \mathbb{C}[[z_2, \ldots, z_m]]$ (respectively, $g_j \in \mathbb{C}\{z_2, \ldots, z_m\}$), with $g_j(0, \ldots, 0) =
0$. Then, $F$ admits a unique factorization

\[ F(z_1, z_2, \ldots, z_m) = W(z_1) \cdot X(z_1, \ldots, z_m), \]

where $W(z)$ is a Weierstrass polynomial and $X$ is an element of $\mathbb{C}[[z_1, \ldots, z_m]]$ (respectively,
$\mathbb{C}\{z_1, \ldots, z_m\}$) satisfying $X(0, 0, \ldots, 0) \ne 0$.


> B.17. Weierstrass Preparation: sketch of a proof. An accessible proof and a discussion of
the formal algebraic result are found in Abhyankar’s lecture notes [2, Ch. 16].

The analytic version of the theorem is the one of use to us in this book. We prove it in the
representative case where $m = 2$ and write $F(z, w)$ for $F(z_1, z_2)$. First, the number of roots of
the equation $F(z, w) = 0$ is given by the integral formula

(32)   \[ \frac{1}{2i\pi} \int_\gamma \frac{F'_w(z, w)}{F(z, w)}\,dw, \]

where $\gamma$ is a small contour encircling 0 in the $w$-plane. There exists a sufficiently small open
set $\Omega$ containing 0 such that the quantity (32), which is an analytic function of $z$ while being
an integer, is constant, and thus necessarily equal to its value at $z = 0$, which we call $d$. The
quantity $d$ is the multiplicity of 0 as a root of the equation $F(0, w) = 0$. In other words, we
have shown that if $F(0, w) = 0$ has $d$ roots equal to 0, then there are $d$ values of $w$ near 0
(within $\gamma$) such that $F(z, w) = 0$, provided $z$ remains small enough (within $\Omega$).
Let $y_1, \ldots, y_d$ be these $d$ roots. Then, we have for the power sum symmetric functions,

\[ y_1^r + \cdots + y_d^r = \frac{1}{2i\pi} \int_\gamma \frac{F'_w(z, w)}{F(z, w)}\, w^r\,dw, \]

which are analytic functions of $z$ when $z$ is sufficiently near to 0. There results from relations between
symmetric functions (Note II.64, p. 88) that $y_1, \ldots, y_d$ are the solutions of a polynomial
equation with analytic coefficients, $W$, which is a uniquely defined Weierstrass polynomial.
The factorization finally results from the fact that $F/W$ has removable singularities. ◁

In essence, by Theorem B.5, functions implicitly defined by a transcendental
equation (an equation $F = 0$) are locally of the same nature as algebraic functions
(corresponding to the equation $W = 0$). In particular, for $m = 2$, when the solutions
have singularities, these singularities can only be branch points and companion
Puiseux expansions hold (Section VII. 7, p. 493). The theorem acquires even greater
importance when perturbative singular expansions (corresponding to $m \ge 3$) become
required for the purpose of extracting limit laws in Chapter IX.


> B.18. Multivariate implicit functions. The following extension of Theorem B.4 is important,
with regard to the solution of systems of equations (Section VII. 6, p. 482). Its statement [104,
§IV.5] makes use of the notion of analytic functions of several variables (Appendix B.8, p. 767).

Theorem B.6 (Multivariate implicit functions). Let $f_i(x_1, \ldots, x_m; z_1, \ldots, z_p)$, with $i =
1, \ldots, m$, be analytic functions in the neighbourhood of a point $x_j = a_j$, $z_k = c_k$. Assume that
the Jacobian determinant defined as

\[ J := \det\left(\frac{\partial f_i}{\partial x_j}\right) \]

is non-zero at the point considered. Then the equations (in the $x_j$)

\[ y_i = f_i(x_1, \ldots, x_m; z_1, \ldots, z_p), \qquad i = 1, \ldots, m, \]

admit a solution with the $x_j$ near to the $a_j$, when the $z_k$ are sufficiently near to the $c_k$ and the $y_i$
near to the $b_i := f_i(a_1, \ldots, a_m; c_1, \ldots, c_p)$: one has

\[ x_j = g_j(y_1, \ldots, y_m; z_1, \ldots, z_p), \]

where each $g_j$ is analytic in a neighbourhood of the point $(b_1, \ldots, b_m; c_1, \ldots, c_p)$.

The basic idea is that the linear approximations expressed by the Jacobian matrix $\left(\frac{\partial f_i}{\partial x_j}\right)$
can be inverted. Hence the $x_j$ depend locally linearly on the $y_i$, $z_k$; hence they are analytic. ◁


B.6. Laplace’s method 


The method of Laplace serves to estimate asymptotically real integrals depending 
on a large parameter n (which may be an integer or a real number). Although it is 
primarily a real analysis technique, we present it in detail, given its relevance to the 
saddle-point method, which deals instead with complex contour integrals. 


Case study: a Wallis integral. In order to demonstrate the essence of the method,
consider first the problem of estimating asymptotically the Wallis integral

(33)   \[ I_n := \int_{-\pi/2}^{\pi/2} (\cos x)^n\,dx, \]

as $n \to +\infty$. The cosine attains its maximum at $x = 0$ (where its value is 1), and
since the integrand of $I_n$ is a large power, the contribution to the integral outside any
fixed segment containing 0 is exponentially small and can consequently be discarded
for all asymptotic purposes. A glance at the plot of $\cos^n x$ as $n$ varies (Figure B.2) also
suggests that the integrand tends to conform to a bell-shaped profile near the centre as
$n$ increases. This is not hard to verify: set $x = w/\sqrt{n}$, then a local expansion yields

(34)   \[ \cos^n x \equiv \exp(n \log \cos(x)) = \exp\left(-\frac{w^2}{2} + O(n^{-1}w^4)\right), \]

the approximation being valid as long as $w = O(n^{1/4})$. Accordingly, we choose
(somewhat arbitrarily)

\[ \kappa_n := n^{1/10}, \]

and define the central range by $|w| \le \kappa_n$. These considerations suggest rewriting the
integral $I_n$ as

\[ I_n = \frac{1}{\sqrt{n}} \int_{-\pi\sqrt{n}/2}^{+\pi\sqrt{n}/2} \left(\cos\frac{w}{\sqrt{n}}\right)^n dw, \]

and lead one to expect, under this new form, an approximation by a Gaussian integral
arising from the central range.

Figure B.2. Plots of $\cos^n x$ [left] and $\cos^n(w/\sqrt{n})$ [right], for $n = 1\,..\,20$.

Laplace’s method proceeds in three steps. 
(i) Neglect the tails of the original integral. 
(ii) Centrally approximate the integrand by a Gaussian. 
(iii) Complete the tails of the Gaussian integral. 


In the case of the cosine integral (33), the chain is summarized in Figure B.3. Details 
of the analysis follow. 


(i) Neglect the tails of the original integral: By (34), we have

\[ \cos^n\left(\frac{\kappa_n}{\sqrt{n}}\right) \sim \exp\left(-\frac{1}{2} n^{1/5}\right), \]

and, since the integrand is unimodal, this exponentially small quantity bounds the
integrand throughout $|w| > \kappa_n$, that is, on a large part of the integration interval. This
gives

(35)   \[ I_n = \int_{-\kappa_n/\sqrt{n}}^{+\kappa_n/\sqrt{n}} \cos^n x\,dx + O\left(\exp\left(-\frac{1}{2}\kappa_n^2\right)\right), \]

and the error term is of the order of $\exp(-\frac{1}{2} n^{1/5})$.
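\[ \begin{aligned}
\int_{-\pi/2}^{\pi/2} \cos^n x\,dx
&= \frac{1}{\sqrt{n}} \int_{-\pi\sqrt{n}/2}^{\pi\sqrt{n}/2} \left(\cos\frac{w}{\sqrt{n}}\right)^n dw && \text{[Set $x = w/\sqrt{n}$; choose $\kappa_n = n^{1/10}$]} \\
&\sim \frac{1}{\sqrt{n}} \int_{-\kappa_n}^{\kappa_n} \left(\cos\frac{w}{\sqrt{n}}\right)^n dw && \text{[Neglect the tails]} \\
&\sim \frac{1}{\sqrt{n}} \int_{-\kappa_n}^{\kappa_n} e^{-w^2/2}\,dw && \text{[Central approxim.]} \\
&\sim \frac{1}{\sqrt{n}} \int_{-\infty}^{\infty} e^{-w^2/2}\,dw && \text{[Complete the tails]} \\
&\sim \sqrt{\frac{2\pi}{n}}.
\end{aligned} \]


Figure B.3. A typical application of the Laplace method.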


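(ii) Centrally approximate the integrand by a Gaussian: In the central region, we
have

(36)
\[ \begin{aligned}
I_n^{(1)} := \int_{-\kappa_n/\sqrt{n}}^{+\kappa_n/\sqrt{n}} \cos^n x\,dx
&= \frac{1}{\sqrt{n}} \int_{-\kappa_n}^{+\kappa_n} e^{-w^2/2} \exp\bigl(O(n^{-1}w^4)\bigr)\,dw \\
&= \frac{1}{\sqrt{n}} \int_{-\kappa_n}^{+\kappa_n} e^{-w^2/2} \bigl(1 + O(n^{-1}w^4)\bigr)\,dw \\
&= \frac{1}{\sqrt{n}} \int_{-\kappa_n}^{+\kappa_n} e^{-w^2/2}\,dw + O(n^{-3/2}),
\end{aligned} \]

given the uniformity of approximation (34) for $w$ in the integration interval.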


(iii) Complete the tails of the Gaussian integral: The incomplete Gaussian integral
in the last line of (36) can be easily estimated once it is observed that its tails are
small. Precisely, one has, for $W \ge 0$,

\[ \int_W^\infty e^{-w^2/2}\,dw \le e^{-W^2/2} \int_0^\infty e^{-h^2/2}\,dh \equiv \sqrt{\frac{\pi}{2}}\, e^{-W^2/2} \]

(by the change of variable $w = W + h$). Thus,

(37)   \[ \int_{-\kappa_n}^{+\kappa_n} e^{-w^2/2}\,dw = \int_{-\infty}^{+\infty} e^{-w^2/2}\,dw + O\left(\exp\left(-\frac{1}{2}\kappa_n^2\right)\right). \]

It now suffices to collect the three approximations, (35), (36), and (37): we have
obtained in this way

(38)   \[ I_n = \frac{1}{\sqrt{n}} \int_{-\infty}^{+\infty} e^{-w^2/2}\,dw + O(n^{-3/2}) \equiv \sqrt{\frac{2\pi}{n}} + O(n^{-3/2}). \]

These three steps comprise Laplace’s method.
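
The outcome of the three steps can be compared with a direct numerical evaluation of the Wallis integral. The sketch below uses a composite Simpson rule (an illustrative choice, as are the function name and the number of panels) and prints the relative gap between $I_n$ and $\sqrt{2\pi/n}$, which should decay roughly like $1/(4n)$.

```python
import math

def wallis_integral(n, panels=20000):
    """Composite Simpson approximation of the integral of cos(x)^n over [-pi/2, pi/2]."""
    a, b = -math.pi / 2, math.pi / 2
    h = (b - a) / panels
    total = math.cos(a) ** n + math.cos(b) ** n
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * math.cos(a + k * h) ** n
    return total * h / 3

for n in (10, 100, 1000):
    numeric = wallis_integral(n)
    laplace = math.sqrt(2 * math.pi / n)
    print(n, numeric, laplace, numeric / laplace - 1)
```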


> B.19. A complete asymptotic expansion. In the asymptotic scale of the problem, the exponentially
small errors in the tails can be completely neglected; the main error in (38) then arises
from the central approximation (34), and its companion $O(w^4 n^{-1})$ term. This can easily be
improved and it suffices to appeal to further terms in the expansion of $\log\cos x$ near 0. For
instance, one has (with $x = w/\sqrt{n}$):

\[ \cos^n x = e^{-w^2/2} \left(1 - \frac{w^4}{12n} + O(n^{-2}w^8)\right). \]

Proceeding as before, we find that a further term in the expansion of $I_n$ is obtained by considering
the additive correction

\[ -\frac{1}{\sqrt{n}} \int_{-\infty}^{+\infty} e^{-w^2/2}\, \frac{w^4}{12n}\,dw \equiv -\sqrt{\frac{\pi}{8n^3}}, \]

so that

\[ I_n = \sqrt{\frac{2\pi}{n}} - \sqrt{\frac{\pi}{8n^3}} + O(n^{-5/2}). \]

A complete asymptotic expansion in the scale $n^{-1/2}, n^{-3/2}, n^{-5/2}, \ldots$ can easily be obtained
in this way. ◁


> B.20. Wallis integrals, central binomials, and the squaring of the circle. The integral $I_n$ is an
integral considered by John Wallis (1616–1703). It can be evaluated through partial integration
or by its relation to the Beta integral (Note B.10, p. 747) as $I_n = \Gamma(\frac{1}{2})\,\Gamma(\frac{n}{2} + \frac{1}{2})/\Gamma(\frac{n}{2} + 1)$.
There results ($n \mapsto 2n$):

\[ \binom{2n}{n} \sim \frac{2^{2n}}{\sqrt{\pi n}} \left(1 - \frac{1}{8n} + \frac{1}{128n^2} + \frac{5}{1024n^3} - \cdots\right), \]

which is yet another avatar of Stirling’s formula. Wallis’ evaluation, when combined with its
asymptotic estimate, is, in Euler’s terms, a formula for “squaring the circle”

\[ \frac{\pi}{4} = \frac{2\cdot4\cdot4\cdot6\cdot6\cdot8\cdot8\cdot10\cdot10\cdots}{3\cdot3\cdot5\cdot5\cdot7\cdot7\cdot9\cdot9\cdot11\cdots}, \]

albeit one that cannot be finitely implemented with ruler and compass. ◁
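
Both statements of this note lend themselves to a quick numerical check. The sketch below compares the normalized central binomial coefficient with the truncated series $1 - \frac{1}{8n} + \frac{1}{128n^2} + \frac{5}{1024n^3}$ and evaluates partial products of the Wallis formula, written here as $\prod_{k\ge 1} (2k)(2k+2)/(2k+1)^2$; the function names are illustrative.

```python
import math

def central_binomial_ratio(n):
    """binom(2n, n) * sqrt(pi n) / 4^n, which tends to 1 as n grows."""
    return math.comb(2 * n, n) / 4 ** n * math.sqrt(math.pi * n)

def series(n):
    """First four terms of the asymptotic series for the normalized central binomial."""
    return 1 - 1 / (8 * n) + 1 / (128 * n**2) + 5 / (1024 * n**3)

def wallis_product(terms):
    """Partial product (2*4)/(3*3) * (4*6)/(5*5) * ... with `terms` factors."""
    prod = 1.0
    for k in range(1, terms + 1):
        prod *= (2 * k) * (2 * k + 2) / (2 * k + 1) ** 2
    return prod

for n in (10, 50):
    print(n, central_binomial_ratio(n) - series(n))
print(wallis_product(100000), math.pi / 4)
```

The Wallis product converges slowly, with an error of order $1/K$ after $K$ factors, which is consistent with Euler's remark that it cannot be "finitely implemented".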




General case of large powers. Laplace’s method applies under general condi- 
tions to integrals involving large powers of a fixed function. 


Theorem B.7 (Laplace’s method). Let $f$ and $g$ be indefinitely differentiable real-valued
functions defined over some compact interval $I$ of the real line. Assume that $|g(x)|$
attains its maximum at a unique point $x_0$ interior to $I$ and that
$f(x_0), g(x_0), g''(x_0) \ne 0$. Then, the integral

\[ I_n := \int_I f(x)\,g(x)^n\,dx \]

admits a complete asymptotic expansion:

\[ I_n \sim \sqrt{\frac{2\pi}{\lambda n}}\; f(x_0)\,g(x_0)^n\left(1 + \sum_{j\ge1}\frac{\delta_j}{n^j}\right), \qquad \lambda := -\frac{g''(x_0)}{g(x_0)}. \tag{39} \]


▷ B.21. Proof of Laplace’s method. Assume first that $f(x) \equiv 1$. Then, one chooses $\kappa_n$
as a function tending slowly to infinity like before ($\kappa_n = n^{1/10}$ is suitable). It
suffices to expand

\[ I_n^{(1)} := \int_{x_0-\kappa_n/\sqrt{n}}^{x_0+\kappa_n/\sqrt{n}} e^{n\log g(x)}\,dx, \]

as the difference $I_n - I_n^{(1)}$ is exponentially small. Set first $x = x_0 + X$ and

\[ L(X) := \log g(x_0+X) - \log g(x_0) + \lambda\frac{X^2}{2}, \]

so that, with $w = X\sqrt{n}$, the central contribution becomes:

\[ I_n^{(1)} = \frac{g(x_0)^n}{\sqrt{n}}\int_{-\kappa_n}^{+\kappa_n} e^{-\lambda w^2/2}\, e^{nL(w/\sqrt{n})}\,dw. \]

Then, expanding $L(X)$ to any order $M$,

\[ L(X) = \sum_{j=3}^{M-1}\ell_j X^j + O(X^M), \]

shows that $e^{nL(w/\sqrt{n})}$ admits a full expansion in descending powers of $\sqrt{n}$:

\[ e^{nL(w/\sqrt{n})} \sim 1 + \frac{\ell_3 w^3}{\sqrt{n}} + \frac{2\ell_4 w^4 + \ell_3^2 w^6}{2n} + \cdots. \]

There, by construction, the coefficient of $n^{-k/2}$ is a polynomial $E_k(w)$ of degree $3k$.
This expression can be truncated to any order, resulting in

\[ I_n^{(1)} = \frac{g(x_0)^n}{\sqrt{n}}\int_{-\kappa_n}^{+\kappa_n} e^{-\lambda w^2/2}\left(1 + \sum_{k=1}^{M-1}\frac{E_k(w)}{n^{k/2}} + O\!\left(\frac{1+w^{3M}}{n^{M/2}}\right)\right)dw. \]

One can then complete the tails at the expense of exponentially small terms since the
Gaussian tails are exponentially small.

The full asymptotic expansion is revealed by the following device: for any power series
$h(w)$, introduce the Gaussian transform,

\[ \mathfrak{G}[h] := \int_{-\infty}^{+\infty} e^{-w^2/2}\,h(w)\,dw, \]

which is understood to operate by linearity on integral powers of $w$,

\[ \mathfrak{G}[w^{2r}] = 1\cdot3\cdots(2r-1)\sqrt{2\pi}, \qquad \mathfrak{G}[w^{2r+1}] = 0. \]

Then, the complete asymptotic expansion of $I_n$ is obtained by the formal expansion

\[ \frac{g(x_0)^n}{\sqrt{\lambda n}}\;\mathfrak{G}\!\left[\exp\!\left(\lambda^{-3/2}w^3 y\,\widetilde{L}(\lambda^{-1/2}wy)\right)\right], \qquad \widetilde{L}(X) := X^{-3}L(X), \quad y \mapsto n^{-1/2}. \tag{40} \]

The addition of the prefactor $f(x)$ (omitted so far) induces a factor $f(x_0)$ in the main
term of the final result and it affects the coefficients in the smaller order terms in a
computable manner. Details are left as an exercise to the reader. ◁


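▷ B.22. The next term? One has (with $f_j := f^{(j)}(x_0)$, $g_j := g^{(j)}(x_0)$, and $g$
normalized so that $g(x_0) = 1$ in the numerator):

\[ \frac{I_n\sqrt{\lambda n}}{\sqrt{2\pi}\,g(x_0)^n} = f_0 + \frac{-9\lambda^3 f_0 + 12\lambda^2 f_2 + 12\lambda f_1 g_3 + 3\lambda f_0 g_4 + 5 g_3^2 f_0}{24\lambda^3 n} + O(n^{-2}), \]

which is best determined using a symbolic manipulation system. ◁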
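
As a sanity check on the two-term formula of Note B.22, here is a small numerical sketch. The test case is an assumption chosen for illustration (it does not come from the text): $g(x) = \cos x$ and $f(x) = 1 + x + x^2$ on $I = [-1, 1]$, so that $x_0 = 0$, $g(x_0) = 1$, $\lambda = 1$, $g_3 = 0$, $g_4 = 1$, $f_0 = f_1 = 1$, $f_2 = 2$. Simpson's rule stands in for the exact integral.

```python
import math

def simpson(func, a, b, intervals):
    """Composite Simpson rule on [a, b]; the number of intervals is forced to be even."""
    if intervals % 2:
        intervals += 1
    h = (b - a) / intervals
    total = func(a) + func(b)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def normalized_integral(n):
    """I_n * sqrt(lambda*n) / (sqrt(2*pi) * g(x0)**n) for the illustrative f and g (lambda = 1)."""
    integrand = lambda x: (1 + x + x * x) * math.cos(x) ** n
    return simpson(integrand, -1.0, 1.0, 20000) * math.sqrt(n) / math.sqrt(2 * math.pi)

def two_term_prediction(n):
    """Right-hand side of Note B.22 without the O(n**-2) remainder."""
    f0, f1, f2 = 1.0, 1.0, 2.0      # f(0), f'(0), f''(0) for f = 1 + x + x^2
    lam, g3, g4 = 1.0, 0.0, 1.0     # lambda, g'''(0), g''''(0) for g = cos
    numerator = (-9 * lam**3 * f0 + 12 * lam**2 * f2 + 12 * lam * f1 * g3
                 + 3 * lam * f0 * g4 + 5 * g3**2 * f0)
    return f0 + numerator / (24 * lam**3 * n)

if __name__ == "__main__":
    for n in (50, 200, 800):
        print(n, normalized_integral(n), two_term_prediction(n))
```

The gap between the two columns should shrink roughly like $n^{-2}$, in line with the stated remainder.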


The method is amenable to a large number of extensions. Roughly it requires 
a point where the integrand is maximized, which induces some sort of exponential 
behaviour, local expansions then allowing for a replacement by standard integrals. 
▷ B.23. Special cases of Laplace’s method. When $f(x_0) = 0$, the integral normalizes to an
integral of the form $\int w^2 e^{-w^2/2}$. If $g'(x_0) = g''(x_0) = g^{(3)}(x_0) = 0$ but
$g^{(4)}(x_0) \ne 0$ then a factor $\Gamma(1/4)$ replaces the characteristic
$\sqrt{\pi} \equiv \Gamma(1/2)$. [Hint: $\int_0^\infty \exp(-w^\beta)\,w^\alpha\,dw = \beta^{-1}\Gamma((\alpha+1)\beta^{-1})$.]
If the maximum is attained at one end of the interval $I = [a, b]$ while $g'(x_0) = 0$,
$g''(x_0) \ne 0$, then the estimate (39) must be multiplied by a factor of 1/2. If the
maximum is attained at one end of the interval $I$ while $g'(x_0) \ne 0$, then the right
normalization is $x = x_0 + w/n$ and the integrand is reducible to an exponential $e^{-w}$.
Here are some dominant asymptotic terms:

  $x_0 \ne a, b$;  $g''(x_0) \ne 0$, $f(x_0) = 0$:
      $\sqrt{\dfrac{\pi}{2\lambda^5 n^3}}\; g(x_0)^n \left(\lambda f''(x_0) + f'(x_0)\,g'''(x_0)\right)$
  $x_0 \ne a, b$;  $g''(x_0) = 0$, $g^{(4)}(x_0) \ne 0$:
      $\Gamma(\tfrac14)\,\sqrt[4]{\dfrac{3}{2\lambda^* n}}\; f(x_0)\,g(x_0)^n$  $\left(\lambda^* := -\dfrac{g^{(4)}(x_0)}{g(x_0)}\right)$
  $x_0 = a$;  $f(x_0) \ne 0$, $g'(x_0) \ne 0$:
      $-\dfrac{1}{n\,g'(x_0)}\; f(x_0)\,g(x_0)^{n+1}$.

A similar analysis is employed in Section VIII.10, p. 600, when we discuss coalescence cases
of the saddle-point method. ◁


Example B.2. Stirling’s formula via Laplace’s method. Start from an integral representation
involving $n!$, namely,

\[ I_n := \int_0^\infty e^{-nx}x^n\,dx = \frac{n!}{n^{n+1}}. \]

This is a direct case of application of the theorem, except for the fact that the integration
interval is not compact. The integrand attains its maximum at $x_0 = 1$ and the remainder
integral $\int_2^\infty$ is accordingly exponentially small as proved by the chain

\[ \int_2^\infty e^{-nx}x^n\,dx = (2e^{-2})^n\int_0^\infty\left(1+\frac{x}{2}\right)^n e^{-nx}\,dx \qquad [x \mapsto x+2] \]
\[ \qquad < (2e^{-2})^n\int_0^\infty e^{nx/2-nx}\,dx = \frac{2}{n}\,(2e^{-2})^n \qquad [\log(1+x/2) < x/2]. \]

Then the integral from 0 to 2 is amenable to the standard version of Laplace’s method as
stated in Theorem B.7 to the effect that

\[ n! = n^n e^{-n}\sqrt{2\pi n}\left(1 + O\!\left(\frac{1}{n}\right)\right). \]

The asymptotic expansion of $I_n$ is derived from (40) and involves the combinatorial GF

\[ H(z,u) = \exp\left(u\left(\log\frac{1}{1-z} - z - \frac{z^2}{2}\right)\right). \tag{41} \]

The noticeable fact is that $H(z,u)$ is the exponential BGF of generalized derangements
involving no cycles of length 1 or 2, with $z$ marking size and $u$ marking the number of cycles:

\[ H(z,u) = \sum_{n,k\ge0} h_{n,k}\,u^k\frac{z^n}{n!} = 1 + \tfrac13 uz^3 + \tfrac14 uz^4 + \tfrac15 uz^5 + \left(\tfrac16 u + \tfrac1{18}u^2\right)z^6 + \left(\tfrac17 u + \tfrac1{12}u^2\right)z^7 + \cdots. \]

Then, a complete asymptotic expansion of $I_n$ is obtained by applying the Gaussian transform
$\mathfrak{G}$ to $H(wy, -y^{-2})$ (with $y = n^{-1/2}$), resulting in

\[ n! \sim n^n e^{-n}\sqrt{2\pi n}\left(1 + \frac{1}{12n} + \frac{1}{288n^2} - \frac{139}{51840n^3} - \cdots\right). \]

Proposition B.1 (Stirling’s formula). The factorial function admits the asymptotic expansion:

\[ x! \equiv \Gamma(x+1) \sim x^x e^{-x}\sqrt{2\pi x}\left(1 + \sum_{q\ge1}\frac{c_q}{x^q}\right) \qquad (x \to +\infty). \]

The coefficients satisfy

\[ c_q = \sum_{k=1}^{2q}\frac{(-1)^k\,h_{2q+2k,k}}{2^{q+k}(q+k)!}, \]

where $h_{n,k}$ counts the number of permutations of size $n$ having $k$ cycles, all of length $\ge 3$.

The derivation above is due to Wrench (see [129, p. 267]). ■


The scope of the method goes much beyond the case of integrals of large powers. 
Roughly, what is needed is a localization of the main contribution of an integral to a 
smaller range (“Neglect the tails”) where local approximations can be applied (“Cen- 
trally approximate”). The approximate integral is then finally estimated by completing 
back the tails (“Complete the tails”).

The Laplace method is excellently described in books by de Bruijn [143] and 
Henrici [329]. A thorough discussion of special cases and multidimensional integrals 
is found in the book by Bleistein and Handelsman [75]. Its principles are fundamental 
to the development of the saddle-point method in Chapter VIII.
▷ B.24. The classical proof of Stirling’s formula. This proceeds from the integral

\[ J_n := \int_0^\infty e^{-x}x^n\,dx \quad (= n!). \]

The maximum is at $x_0 = n$ and the central range is now $n \pm \kappa_n\sqrt{n}$. Reduction to a
Gaussian integral follows, but the estimate is no longer a direct application of Theorem B.7. ◁
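
The coefficient formula of Proposition B.1 lends itself to an exact check. The sketch below is illustrative only: the helper names are not from the text, and the numbers $h_{n,k}$ are computed by a standard recurrence (the largest element either joins an existing cycle or forms a new 3-cycle), which is an assumption of this example rather than the route taken in the book.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def h(n, k):
    """Number of permutations of size n with exactly k cycles, all of length >= 3."""
    if n == 0:
        return 1 if k == 0 else 0
    if k <= 0 or n < 3 * k:
        return 0
    # Element n is inserted into a cycle of a valid permutation of size n-1 (n-1 ways),
    # or it forms a new 3-cycle with an ordered pair of partners ((n-1)(n-2) ways).
    return (n - 1) * h(n - 1, k) + (n - 1) * (n - 2) * h(n - 3, k - 1)

def stirling_coefficient(q):
    """c_q = sum_{k=1}^{2q} (-1)^k h(2q+2k, k) / (2^(q+k) (q+k)!), as an exact fraction."""
    return sum(Fraction((-1) ** k * h(2 * q + 2 * k, k), 2 ** (q + k) * factorial(q + k))
               for k in range(1, 2 * q + 1))

if __name__ == "__main__":
    for q in range(1, 5):
        print(q, stirling_coefficient(q))
```

The first three values reproduce the coefficients 1/12, 1/288 and −139/51840 displayed in Example B.2.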


Laplace’s method for sums. The basic principles of the method of Laplace (for 
integrals) can often be recycled for the asymptotic evaluation of discrete sums. Take a 
finite or infinite sum $S_n$ defined by

\[ S_n := \sum_k t(n,k). \]

A preliminary task consists in working out the general aspect of the family of numbers
$\{t(n,k)\}$ for fixed (but large) $n$ as $k$ varies. In particular, one should locate the
value $k_0 \equiv k_0(n)$ of $k$ for which $t(n,k)$ is maximal. In a vast number of cases, tails
can be neglected; a central approximation $\hat t(n,k)$ of $t(n,k)$ for $k$ in the “central”
region near $k_0$ can be determined, frequently under the form [remember that we use in
this book ‘$\approx$’ in the loose sense of “approximately equal”]

\[ \hat t(n,k) \approx s(n)\,\varphi\!\left(\frac{k-k_0}{\sigma_n}\right), \]

where $\varphi$ is some smooth function while $s(n)$ and $\sigma_n$ are scaling constants. The
quantity $\sigma_n$ indicates the range of the asymptotically significant terms. One may then
expect

\[ S_n \approx s(n)\sum_k \varphi\!\left(\frac{k-k_0}{\sigma_n}\right). \]

Then provided $\sigma_n \to \infty$, one may further expect to approximate the sum by an
integral, which after completing the tails, gives

\[ S_n \approx s(n)\,\sigma_n\int_{-\infty}^{+\infty}\varphi(t)\,dt. \]


Example B.3. Sums of powers of binomial coefficients. Here is, in telegraphic style, an
application to sums of powers of binomial coefficients:

\[ S_n^{(r)} = \sum_{k=-n}^{+n}\binom{2n}{n+k}^r. \]

The largest term arises at $k_0 = 0$. Furthermore, one has elementarily

\[ \binom{2n}{n+k}\Big/\binom{2n}{n} = \frac{\left(1-\frac1n\right)\cdots\left(1-\frac{k-1}{n}\right)}{\left(1+\frac1n\right)\cdots\left(1+\frac{k}{n}\right)}. \]

By the exp–log transformation and the expansion of $\log(1+x)$, one has

\[ \binom{2n}{n+k}\Big/\binom{2n}{n} = \exp\left(-\frac{k^2}{n} + O(k^3n^{-2})\right). \tag{42} \]

This approximation holds for $k = o(n^{2/3})$, where it provides a Gaussian approximation
($\varphi(x) = e^{-rx^2}$) with a span of $\sigma_n = \sqrt{n}$. Tails can be neglected, so that

\[ S_n^{(r)} \sim \binom{2n}{n}^r\sum_k \exp\left(-r\frac{k^2}{n}\right), \]

say with $|k| < n^{1/2}\kappa_n$ where $\kappa_n = n^{1/10}$. Then approximating the Riemann sum by
an integral and completing the tails, one gets

\[ S_n^{(r)} \sim \binom{2n}{n}^r\sqrt{n}\int_{-\infty}^{+\infty}e^{-rw^2}\,dw, \qquad\text{that is,}\qquad S_n^{(r)} \sim \frac{2^{2rn}}{\sqrt{r}}\,(\pi n)^{-(r-1)/2}, \]

which is our final estimate. ■
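
The final estimate of Example B.3 can be compared with exact sums. This is a rough sketch with illustrative function names; logarithms are used throughout because $2^{2rn}$ overflows floating point for moderate $n$.

```python
import math

def log_exact_sum(n, r):
    """log of S_n^(r) = sum over k of binomial(2n, n+k)**r, using exact integers."""
    total = sum(math.comb(2 * n, n + k) ** r for k in range(-n, n + 1))
    return math.log(total)

def log_estimate(n, r):
    """log of the estimate 2^(2rn) / sqrt(r) * (pi*n)^(-(r-1)/2)."""
    return 2 * r * n * math.log(2) - 0.5 * math.log(r) - 0.5 * (r - 1) * math.log(math.pi * n)

if __name__ == "__main__":
    for r in (1, 2, 3, 4):
        for n in (50, 200):
            print(r, n, math.exp(log_exact_sum(n, r) - log_estimate(n, r)))
```

For $r = 1$ the estimate is exact ($S_n^{(1)} = 2^{2n}$); for $r = 2$ the sum equals the central binomial coefficient $\binom{4n}{2n}$, which gives an independent check.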


▷ B.25. Elementary approximation of Bell numbers. The Bell numbers counting set partitions
(p. 109) are

\[ B_n = n!\,[z^n]\,e^{e^z-1} = e^{-1}\sum_{k=0}^{\infty}\frac{k^n}{k!}. \]

The largest term occurs for $k$ near $e^u$ where $u$ is the positive root of the equation
$ue^u = n + 1$; the central terms are approximately Gaussian. There results the estimate,

\[ B_n = n!\,e^{-1}(2\pi)^{-1/2}(1+u^{-1})^{-1/2}\exp\left(e^u(1-u\log u) - \tfrac12 u\right)\left(1 + O(e^{-u})\right). \tag{43} \]

This alternative to saddle-point asymptotics (p. 560) is detailed in [143, p. 108]. ◁
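
Estimate (43) can be compared with exact Bell numbers. The sketch below makes two assumptions of its own: exact values come from the Bell triangle, and the root $u$ of $ue^u = n+1$ is found by Newton iteration started at $\log(n+1)$. The function names are illustrative.

```python
import math

def bell_number(n):
    """Exact Bell number B_n computed with the Bell triangle."""
    row = [1]
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]

def solve_u(n, iterations=60):
    """Positive root of u * exp(u) = n + 1 by Newton's method."""
    u = math.log(n + 1)
    for _ in range(iterations):
        eu = math.exp(u)
        u -= (u * eu - (n + 1)) / (eu * (u + 1))
    return u

def log_bell_estimate(n):
    """Logarithm of the main term of (43), i.e. without the (1 + O(e^-u)) factor."""
    u = solve_u(n)
    return (math.lgamma(n + 1) - 1 - 0.5 * math.log(2 * math.pi)
            - 0.5 * math.log(1 + 1 / u)
            + math.exp(u) * (1 - u * math.log(u)) - u / 2)

if __name__ == "__main__":
    for n in (10, 50, 100, 200):
        print(n, math.exp(log_bell_estimate(n) - math.log(bell_number(n))))
```

The printed ratios should drift slowly towards 1, consistent with a relative error of order $e^{-u}$, which decays only like $(\log n)/n$.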


B.7. Mellin transforms 


The Mellin transform² of a function $f$ defined over $\mathbb{R}_{>0}$ is the complex-variable
function $f^*(s)$ defined by the integral

\[ f^*(s) := \int_0^\infty f(x)\,x^{s-1}\,dx. \tag{44} \]

This transform is also occasionally denoted by $\mathcal{M}[f]$ or $\mathcal{M}[f(x); s]$. Its
importance devolves from two properties: (i) it maps asymptotic expansions of a function at 0
and $+\infty$ to singularities of the transform; (ii) it factorizes harmonic sums (defined
below). The conjunction of the mapping property and the harmonic sum property
makes it possible to analyse asymptotically rather complicated sums arising from a
linear superposition of models taken at different scales. Major properties are summarized
in Figure B.4. In this brief review, detailed analytic conditions must be omitted:
see the survey [234] as well as comments and references at the end of this entry.

² In the context of this book, Mellin transforms are useful in analyses relative to the longest run
problem (p. 311), the height of trees (p. 329), polylogarithms (p. 408), and integer partitions (p. 576).
They also serve to establish fundamental asymptotic expansions, as in the case of harmonic and
factorial numbers (below).

It is assumed that $f$ is locally integrable. Then, the two conditions,

\[ f(x) \underset{x\to0^+}{=} O(x^u), \qquad f(x) \underset{x\to+\infty}{=} O(x^v), \]

guarantee that $f^*$ exists for $s$ in a strip,

\[ s \in \langle -u, -v\rangle, \quad\text{i.e.,}\quad -u < \Re(s) < -v. \]

Thus existence of the transform is granted provided $v < u$. The prototypical Mellin
transform is the Gamma function discussed earlier in this appendix:

\[ \Gamma(s) := \int_0^\infty e^{-x}x^{s-1}\,dx = \mathcal{M}[e^{-x}; s], \qquad 0 < \Re(s) < \infty. \]

Similarly $f(x) = (1+x)^{-1}$ is $O(x^0)$ at 0 and $O(x^{-1})$ at infinity, and hence its
transform exists in the strip $\langle 0, 1\rangle$; it is in fact $\pi/\sin\pi s$, as a consequence
of the Eulerian Beta integral. The Heaviside function defined by $H(x) := [[0 \le x < 1]]$
has a transform that exists in $\langle 0, +\infty\rangle$ and equals $1/s$.
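
The two transforms just quoted can be confirmed numerically for real $s$ inside the fundamental strip. The following is a minimal sketch, not a general-purpose routine: it assumes the change of variable $x = e^y$, which turns (44) into $\int_{-\infty}^{+\infty} f(e^y)e^{sy}\,dy$, and applies the trapezoidal rule on a truncated range. The function name and default parameters are illustrative.

```python
import math

def mellin_transform(f, s, half_width=100.0, steps=40000):
    """Approximate the Mellin transform of f at real s via x = exp(y) and the trapezoidal rule."""
    h = 2 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        y = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        # x^(s-1) dx becomes exp(s*y) dy under x = exp(y).
        total += weight * f(math.exp(y)) * math.exp(s * y)
    return total * h

if __name__ == "__main__":
    print(mellin_transform(lambda x: math.exp(-x), 2.5), math.gamma(2.5))
    print(mellin_transform(lambda x: 1 / (1 + x), 0.5), math.pi / math.sin(math.pi * 0.5))
```

The truncation at $|y| \le 100$ is adequate only when $s$ stays well inside the strip, so that the integrand decays exponentially at both ends.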


Harmonic sum property. The Mellin transform is a linear transform. In addition, 
it satisfies the simple but important rescaling rule: 


\[ f(x) \overset{\mathcal{M}}{\mapsto} f^*(s) \quad\text{implies}\quad f(\mu x) \overset{\mathcal{M}}{\mapsto} \mu^{-s}f^*(s), \]

for any $\mu > 0$. Linearity then entails the derived rule

\[ \sum_k \lambda_k f(\mu_k x) \;\overset{\mathcal{M}}{\mapsto}\; \left(\sum_k \lambda_k\mu_k^{-s}\right)\cdot f^*(s), \tag{45} \]

valid a priori for any finite set of pairs $(\lambda_k, \mu_k)$ and extending to infinite sums
whenever the interchange of $\int$ and $\sum$ is permissible. A sum of the form (45) is called
a harmonic sum, the function $f$ is the “base function”, the $\lambda$ values are the
“amplitudes” and the $\mu$ values the “frequencies”. Equation (45) then yields the “harmonic
sum rule”: The Mellin transform of a harmonic sum factorizes as the product of the
transform of the base function and a generalized Dirichlet series associated to amplitudes
and frequencies. Harmonic sums surface recurrently in the context of analytic
combinatorics and Mellin transforms are a method of choice for coping with them.
Here are a few examples of application of the harmonic sum rule (45):

\[ \sum_{k\ge1} e^{-k^2x^2} \mapsto \tfrac12\Gamma(s/2)\,\zeta(s) \quad (\Re(s) > 1), \qquad \sum_{k\ge0} e^{-x2^k} \mapsto \frac{\Gamma(s)}{1-2^{-s}} \quad (\Re(s) > 0), \]
\[ \sum_{k\ge1} (\log k)\,e^{-\sqrt{k}\,x} \mapsto -\zeta'(s/2)\,\Gamma(s) \quad (\Re(s) > 2), \qquad \sum_{k\ge1} \frac{1}{k(k+x)} \mapsto \zeta(2-s)\,\frac{\pi}{\sin\pi s} \quad (0 < \Re(s) < 1). \]


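▷ B.26. Connection between power series and Dirichlet series. Let $(f_n)$ be a sequence of
numbers with at most polynomial growth, $f_n = O(n^r)$, and with OGF $f(z)$. Then, one has

\[ \sum_{n\ge1}\frac{f_n}{n^s} = \frac{1}{\Gamma(s)}\int_0^\infty f(e^{-x})\,x^{s-1}\,dx, \qquad \Re(s) > r+1. \]

For instance, one obtains the Mellin pairs

\[ \frac{e^{-x}}{1-e^{-x}} \overset{\mathcal{M}}{\mapsto} \zeta(s)\Gamma(s) \quad (\Re(s) > 1), \qquad \log\frac{1}{1-e^{-x}} \overset{\mathcal{M}}{\mapsto} \zeta(s+1)\Gamma(s) \quad (\Re(s) > 0). \tag{46} \]

These serve to analyse sums or, conversely, deduce analytic properties of Dirichlet series. ◁

  Function ($f(x)$)                                          | Mellin transform ($f^*(s)$)                          | Rule
  -----------------------------------------------------------+------------------------------------------------------+----------------------------------------------
  $f(x)$                                                     | $\int_0^\infty f(x)x^{s-1}\,dx$                      | definition, $s \in \langle -u, -v\rangle$
  $\frac{1}{2i\pi}\int_{c-i\infty}^{c+i\infty} f^*(s)x^{-s}\,ds$ | $f^*(s)$                                         | inversion th., $-u < c < -v$
  $\sum_i \lambda_i f_i(x)$                                  | $\sum_i \lambda_i f_i^*(s)$                          | linearity
  $f(\mu x)$                                                 | $\mu^{-s}f^*(s)$                                     | scaling rule ($\mu > 0$)
  $x^\rho f(x^\theta)$                                       | $\frac{1}{\theta}f^*\!\left(\frac{s+\rho}{\theta}\right)$ | power rule
  $\sum_i \lambda_i f(\mu_i x)$                              | $\left(\sum_i \lambda_i\mu_i^{-s}\right)\cdot f^*(s)$ | harmonic sum rule ($\mu_i > 0$)
  $\int_0^\infty \lambda(t)f(tx)\,dt$                        | $\int_0^\infty \lambda(t)t^{-s}\,dt\cdot f^*(s)$     | harmonic integral rule
  $f(x)\log^k x$                                             | $\partial_s^k f^*(s)$                                | diff. I, $k \in \mathbb{Z}_{\ge0}$, $\partial_s := \frac{d}{ds}$
  $\partial_x^k f(x)$                                        | $\frac{(-1)^k\Gamma(s)}{\Gamma(s-k)}f^*(s-k)$        | diff. II, $k \in \mathbb{Z}_{\ge0}$, $\partial_x := \frac{d}{dx}$
  $\underset{x\to0}{\sim} x^\alpha(\log x)^k$                | $\underset{s\to-\alpha}{\sim} \frac{(-1)^k k!}{(s+\alpha)^{k+1}}$ | mapping: $x \to 0$, left poles
  $\underset{x\to+\infty}{\sim} x^\beta(\log x)^k$           | $\underset{s\to-\beta}{\sim} -\frac{(-1)^k k!}{(s+\beta)^{k+1}}$ | mapping: $x \to \infty$, right poles

Figure B.4. A summary of major properties of Mellin transforms.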


Mapping properties. Mellin transforms map asymptotic terms in the expansions 
of a function f at 0 and +00 onto singular terms of the transform f*. This property 
stems from the basic Heaviside function identities 


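\[ H(x)\,x^\alpha \overset{\mathcal{M}}{\mapsto} \frac{1}{s+\alpha} \quad (s \in \langle -\alpha, +\infty\rangle), \qquad (1-H(x))\,x^\beta \overset{\mathcal{M}}{\mapsto} -\frac{1}{s+\beta} \quad (s \in \langle -\infty, -\beta\rangle), \]

as well as what one obtains by differentiation with respect to $\alpha$, $\beta$.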
The converse mapping property also holds. Like for other integral transforms, 
there is an inversion formula: if f is continuous in an interval containing x, then 


\[ f(x) = \frac{1}{2i\pi}\int_{c-i\infty}^{c+i\infty} f^*(s)\,x^{-s}\,ds, \tag{47} \]


where the abscissa c should be chosen in the “fundamental strip” of f; for instance 
any c satisfying —u <c < —v with u, v as above is suitable. 

In many cases of practical interest, $f^*$ is continuable as a meromorphic function
to the whole of $\mathbb{C}$. If the continuation of $f^*$ does not grow too fast along vertical
lines, then one can estimate the inverse Mellin integral of (47) by residues. This corresponds
to shifting the line of integration to some $d \ne c$ and taking poles into account by the
residue theorem. Since the residue at a pole $s_0$ of $f^*$ involves a factor of $x^{-s_0}$, the
contribution of $s_0$ will give useful information on $f(x)$ as $x \to \infty$ if $s_0$ lies to the
right of $c$, and on $f(x)$ as $x \to 0$ if $s_0$ lies to the left. Higher order poles introduce
additional logarithmic factors. The “dictionary” is simply

\[ \frac{1}{(s-s_0)^{k+1}} \;\overset{\mathcal{M}^{-1}}{\mapsto}\; \pm\frac{(-1)^k}{k!}\,x^{-s_0}(\log x)^k, \tag{48} \]

where the sign is ‘+’ for a pole on the left of the fundamental strip and ‘−’ for a pole
on the right.


Mellin asymptotic summation. The combination of mapping properties and the 
harmonic sum property constitutes a powerful tool of asymptotic analysis, as shown 
by the examples and the notes below. 


Example B.4. Asymptotics of a simple harmonic sum. Let us first investigate the pair 


\[ F(x) := \sum_{k\ge1}\frac{1}{1+k^2x^2}, \qquad F^*(s) = \frac12\,\frac{\pi}{\sin\frac12\pi s}\,\zeta(s), \]

where $F^*$ results from the harmonic sum rule and has fundamental strip $\langle 1, 2\rangle$. The
function $F^*$ is continuable to the whole of $\mathbb{C}$ with poles at the points 0, 1, 2 and
4, 6, 8, .... The transform $F^*$ is small towards infinity, so that application of the
dictionary (48) is justified. One finds

\[ F(x) \underset{x\to0^+}{\sim} \frac{\pi}{2x} - \frac12 + O(x^M), \qquad F(x) \underset{x\to+\infty}{\sim} \frac{\pi^2}{6x^2} - \frac{\pi^4}{90x^4} + \cdots, \]

where the expansion at 0 is valid for any $M > 0$. ■
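
Both expansions of Example B.4 are easy to test against a direct (truncated) evaluation of the sum. This is an illustrative sketch: the function name and the cut-off of 200000 terms are arbitrary choices, and the neglected tail is roughly $1/(Kx^2)$ for a cut-off at $K$ terms.

```python
import math

def harmonic_sum_F(x, terms=200000):
    """Truncated evaluation of F(x) = sum_{k>=1} 1 / (1 + k^2 x^2)."""
    return sum(1.0 / (1.0 + (k * x) ** 2) for k in range(1, terms + 1))

def expansion_at_zero(x):
    """pi/(2x) - 1/2, the expansion of F as x -> 0+."""
    return math.pi / (2 * x) - 0.5

def expansion_at_infinity(x):
    """pi^2/(6 x^2) - pi^4/(90 x^4), the first two terms as x -> +infinity."""
    return math.pi**2 / (6 * x**2) - math.pi**4 / (90 * x**4)

if __name__ == "__main__":
    print(harmonic_sum_F(0.5), expansion_at_zero(0.5))
    print(harmonic_sum_F(10.0), expansion_at_infinity(10.0))
```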


Example B.5. Asymptotics of a dyadic sum. A particularly important quantity in analytic
combinatorics is the following harmonic sum, stated here together with its Mellin transform:

\[ \Phi(x) := \sum_{k=0}^{\infty}\left(1 - e^{-x/2^k}\right), \qquad \Phi^*(s) = -\frac{\Gamma(s)}{1-2^s}, \qquad s \in \langle -1, 0\rangle. \]

It occurs for instance in the analysis of longest runs in words (p. 311). The transform of
$e^{-x} - 1$ is also $\Gamma(s)$, but in the shifted strip $\langle -1, 0\rangle$. The singularities of
$\Phi^*$ are at $s = 0$, where there is a double pole, at $s = -1, -2, \ldots$ which are simple
poles, but also at the complex points

\[ \chi_k = \frac{2ik\pi}{\log 2}. \]

The Mellin dictionary (48) can still be applied provided one integrates along a long
rectangular contour that passes in-between poles. The salient feature is here the presence of
fluctuations induced by the imaginary poles, since $x^{-\chi_k} = \exp(-2ik\pi\log_2 x)$, and each
pole induces a Fourier element. All in all, one finds (any $M > 0$):

\[ \Phi(x) \underset{x\to+\infty}{\sim} \log_2 x + \frac{\gamma}{\log 2} + \frac12 + P(x) + O(x^{-M}), \qquad P(x) := -\frac{1}{\log 2}\sum_{k\in\mathbb{Z}\setminus\{0\}}\Gamma\!\left(\frac{2ik\pi}{\log 2}\right)e^{-2ik\pi\log_2 x}. \tag{49} \]

The analysis for $x \to 0$ yields, in this particular case,
$\Phi(x) \underset{x\to0}{\sim} \sum_{n\ge1}\frac{(-1)^{n-1}}{1-2^{-n}}\frac{x^n}{n!}$, which
would also result from expanding $\exp(-x/2^k)$ in $\Phi(x)$ and reorganizing the terms. ■
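
The expansions of the dyadic sum can be observed numerically. The sketch below is illustrative (names and truncation limits are assumptions of this example); it compares $\Phi(x)$ with the non-oscillating part of (49) for large $x$ and with the power series for small $x$. Since $|\Gamma(2i\pi/\log 2)|$ is of order $10^{-6}$, the fluctuation $P(x)$ is tiny but not zero.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded to double precision

def dyadic_sum(x, terms=200):
    """Phi(x) = sum_{k>=0} (1 - exp(-x / 2^k)), truncated; expm1 keeps small terms accurate."""
    return sum(-math.expm1(-x / 2.0**k) for k in range(terms))

def smooth_part(x):
    """log2(x) + gamma/log(2) + 1/2, i.e. expansion (49) without the fluctuation P(x)."""
    return math.log2(x) + EULER_GAMMA / math.log(2) + 0.5

def series_at_zero(x, order=30):
    """sum_{n>=1} (-1)^(n-1) / (1 - 2^-n) * x^n / n!, truncated at the given order."""
    return sum((-1) ** (n - 1) / (1 - 2.0**-n) * x**n / math.factorial(n)
               for n in range(1, order + 1))

if __name__ == "__main__":
    for x in (100.0, 1000.0, 12345.6):
        print(x, dyadic_sum(x) - smooth_part(x))   # fluctuation of size about 1e-6
    print(dyadic_sum(0.5), series_at_zero(0.5))
```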


Example B.6. Euler–Maclaurin summation via Mellin analysis. Let $f$ be continuous on
$(0, +\infty)$ and satisfy $f(x) \underset{x\to+\infty}{=} O(x^{-1-\delta})$, for some $\delta > 0$, and

\[ f(x) \underset{x\to0^+}{\sim} \sum_{k\ge0} f_k x^k. \]

The summatory function $F(x)$ satisfies

\[ F(x) := \sum_{n\ge1} f(nx), \qquad F^*(s) = \zeta(s)\,f^*(s), \]

by the harmonic sum rule. The collection of (trimmed) singular expansions of $f^*$ at
$s = 0, -1, -2, \ldots$ is summarized by the formal expansion, conventionally represented by $\asymp$:

\[ f^*(s) \asymp \left(\frac{f_0}{s}\right)_{s=0} + \left(\frac{f_1}{s+1}\right)_{s=-1} + \left(\frac{f_2}{s+2}\right)_{s=-2} + \cdots. \]

Thus, by the mapping properties, provided $F^*(s)$ is small towards $\pm i\infty$ in finite
strips, one has

\[ F(x) \underset{x\to0}{\sim} \frac{1}{x}\int_0^\infty f(t)\,dt + \sum_{j\ge0} f_j\,\zeta(-j)\,x^j, \]

where the main term is associated to the singularity of $F^*$ at 1 and arises from the pole of
$\zeta(s)$, with $f^*(1)$ giving the integral of $f$. The interest of this approach is that it is
very versatile and allows for various forms of asymptotic expansions of $f$ at 0 as well as
multipliers like $(-1)^k$, $\log k$, and so on; see [234] for details and Gonnet’s note [300]
for alternative approaches. ■


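▷ B.27. Mellin-type derivation of Stirling’s formula. One has the Mellin pair

\[ L(x) := \sum_{k\ge1}\left[\log\left(1+\frac{x}{k}\right) - \frac{x}{k}\right], \qquad L^*(s) = \frac{\pi\,\zeta(-s)}{s\sin\pi s}, \qquad s \in \langle -2, -1\rangle. \]

Note that $L(x) = \log\left(e^{-\gamma x}/\Gamma(1+x)\right)$. Mellin asymptotics provides

\[ L(x) \underset{x\to+\infty}{\sim} -x\log x - (\gamma-1)x - \frac12\log x - \log\sqrt{2\pi} - \frac{1}{12x} + \frac{1}{360x^3} - \frac{1}{1260x^5} + \cdots, \]

where one recognizes Stirling’s expansion of $x!$:

\[ \log x! \underset{x\to+\infty}{\sim} \log\left(x^x e^{-x}\sqrt{2\pi x}\right) + \sum_{n\ge1}\frac{B_{2n}}{2n(2n-1)}\,\frac{1}{x^{2n-1}} \]

(the $B_n$ are the Bernoulli numbers). ◁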


▷ B.28. Mellin-type analysis of the harmonic numbers. For $\alpha > 0$, one has the Mellin pair:

\[ K_\alpha(x) := \sum_{k\ge1}\left(\frac{1}{k^\alpha} - \frac{1}{(k+x)^\alpha}\right), \qquad K_\alpha^*(s) = -\zeta(\alpha-s)\,\frac{\Gamma(s)\Gamma(\alpha-s)}{\Gamma(\alpha)}. \]

This serves to estimate harmonic numbers and their generalizations, for instance,

\[ H_n \sim \log n + \gamma + \frac{1}{2n} - \sum_{k\ge2}\frac{B_k}{k}\,n^{-k} \sim \log n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4} - \cdots, \]

since $K_1(n) = H_n$. ◁

General references on Mellin transforms are the books by Doetsch [168] and Wid-
der [605]. The term “harmonic sum” and some of the corresponding technology orig- 
inates with the abstract [253]. This brief presentation is based on the survey article 
by Flajolet, Gourdon, and Dumas [234] to which we refer for a detailed treatment; 
see also the self-contained treatment by Butzer and Jansche [100]. Mellin analysis of 
“harmonic integrals” is a classical topic of applied mathematics for which we refer 
to the books by Wong [614] and Paris—Kaminski [472]. Valuable accounts of proper- 
ties of use in discrete mathematics and analysis of algorithms appear in the books by 
Hofri [335], Mahmoud [429], and Szpankowski [564]. 


B.8. Several complex variables 


The theory of analytic (or holomorphic) functions of one complex variable
extends non-trivially to several complex variables. This profound theory has been largely
developed in the course of the twentieth century. Here we shall only need the most 
basic concepts, not the deeper results, of the theory. 

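Consider the space $\mathbb{C}^m$ endowed with the metric

\[ |\mathbf{z}|^2 = |(z_1, \ldots, z_m)|^2 = \sum_{j=1}^m |z_j|^2, \]

under which it is isomorphic to the Euclidean space $\mathbb{R}^{2m}$. A function $f$ from
$\mathbb{C}^m$ to $\mathbb{C}$ is said to be analytic at some point $\mathbf{a}$ if in a neighbourhood
of $\mathbf{a}$ it can be represented by a convergent power series,

\[ f(\mathbf{z}) = \sum_{\mathbf{n}} f_{\mathbf{n}}(\mathbf{z}-\mathbf{a})^{\mathbf{n}} \equiv \sum_{n_1,\ldots,n_m} f_{n_1,\ldots,n_m}(z_1-a_1)^{n_1}\cdots(z_m-a_m)^{n_m}. \tag{50} \]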


There and throughout the theory, extensive use is made of the multi-index convention, 
as encountered in Chapter III, p. 165. 

An expansion (50) converges in a polydisc $\prod_j \{|z_j - a_j| < r_j\}$, for some $r_j > 0$.
A convergent expansion at $(0, \ldots, 0)$ has its coefficients majorized in absolute value
by those of a series of the form

\[ \prod_{j=1}^m \frac{1}{1 - z_j/R_j} = \sum_{\mathbf{n}} \mathbf{R}^{-\mathbf{n}}\mathbf{z}^{\mathbf{n}} \equiv \sum_{n_1,\ldots,n_m} R_1^{-n_1}\cdots R_m^{-n_m}\, z_1^{n_1}\cdots z_m^{n_m}. \]

Closure of analytic functions under sums, products, and compositions results from
standard manipulations of majorant series (see p. 250 for the univariate case). Finally,
a function is analytic in an open set $\Omega \subset \mathbb{C}^m$ iff it is analytic at each $\mathbf{a} \in \Omega$.

A remarkable theorem of Hartogs asserts that $f(\mathbf{z})$ with $\mathbf{z} \in \mathbb{C}^m$ is analytic jointly
in all the $z_j$ (in the sense of (50)) if it is analytic separately in each variable $z_j$. (The
version of the theorem that postulates a priori continuity is elementary.) 

As in the one-dimensional case, analytic functions can be equivalently defined by
means of differentiability conditions. A function is $\mathbb{C}$-differentiable or holomorphic
at $\mathbf{a}$ if, as $\Delta\mathbf{z} \to 0$ in $\mathbb{C}^m$, one has

\[ f(\mathbf{a} + \Delta\mathbf{z}) - f(\mathbf{a}) = \sum_{j=1}^m c_j\,\Delta z_j + o(|\Delta\mathbf{z}|). \]

The coefficients $c_j$ are the partial derivatives, $c_j = \partial_{z_j} f(\mathbf{a})$. The fact that
this relation does not depend on the way $\Delta\mathbf{z}$ tends to 0 implies the Cauchy–Riemann
equations. In a way that parallels the single variable case, it is proved that two conditions
are equivalent: $f$ is analytic; $f$ is complex-differentiable.

Iterated integrals are defined in the natural way and one finds, by a repeated use 
of calculus in a single variable, 

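\[ f(\mathbf{z}) = \frac{1}{(2i\pi)^m}\int_{C_1}\cdots\int_{C_m}\frac{f(\boldsymbol{\zeta})}{(\zeta_1-z_1)\cdots(\zeta_m-z_m)}\,d\zeta_1\cdots d\zeta_m, \tag{51} \]

where $C_j$ is a small circle surrounding $z_j$ in the $\zeta_j$-plane. By differentiation under the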
integral sign, Equation (51) also provides an integral formula for the partial derivatives 
of f, which is the analogue of Cauchy’s coefficient formula. Iterated integrals are 
independent of details of the “polypath” on which they are taken, and uniqueness of 
analytic continuation holds. 

The theory of functions of several complex variables develops in the direction of 
an integral calculus that is much more powerful than the iterated integrals mentioned 
above; see, for instance, the book by Aizenberg and Yuzhakov [8] for a multidimen- 
sional residue approach. Egorychev’s monograph [187] develops systematic applica- 
tions of the theory of functions of one or several complex variables to the evaluation 
of combinatorial sums. Pemantle together with several coauthors [474, 475, 476] has 
launched an ambitious research programme meant to extract the coefficients of mero- 
morphic multivariate generating functions by means of this theory, with the ultimate 
goal of obtaining systematically asymptotics from multivariate generating functions. 
By contrast, see especially Chapter IX, we can limit ourselves to developing a pertur- 
bative theory of one-variable complex function theory. 

In the context of this book, the basic notion of analyticity in several complex vari- 
ables serves to confer a bona fide analytic meaning to multivariate generating func- 
tions. Basic definitions are also needed in the context of functions f defined implicitly 
by functional relations of the form H(z, f) = 0 or H(z, u, f) = 0, where analytic 
functions of two or more complex variables make an appearance. (See in particular the 
discussion of the analytic Implicit Function Theorem and the Weierstrass Preparation 
Theorem in this appendix, p. 753.) 


APPENDIX C 


Concepts of Probability Theory 


This appendix contains entries arranged in logical order regarding the following topics: 


Probability spaces and measure; Random variables; Transforms of distributions; 

Special distributions; Convergence in law. 
In this book we start from probability spaces that are finite, since they arise from objects of a 
fixed size in some combinatorial class (see Chapter III and Appendix A.3: Combinatorial prob- 
ability, p. 727 for elementary aspects), then need basic properties of continuous distributions in 
order to discuss asymptotic limit laws. The entries in this appendix are related principally to 
Chapter IX of Part C (Random Structures). They present a unified framework that encompasses 
discrete and continuous probability distributions alike. For further study, we recommend the su- 
perb classics of Feller [205, 206], given the author’s concrete approach, and of Billingsley [68], 
whose coverage of limit distributions is of great value for analytic combinatorics. 


C.1. Probability spaces and measure 


An axiomatization of probability theory¹ was discovered in the 1930s by Kolmogorov.
A measurable space consists of a set $\Omega$, called the set of elementary events
or the sample set, and a $\sigma$-algebra $\mathcal{A}$ of subsets of $\Omega$ called events (that is, a
collection of sets containing $\emptyset$ and closed under complement and denumerable unions). A
measure space is a measurable space endowed with a measure $\mu : \mathcal{A} \mapsto \mathbb{R}_{\ge0}$ that
is additive over finite or denumerable unions of disjoint sets; in that case, elements
of $\mathcal{A}$ are called measurable sets. A probability space is a measure space for which the
measure satisfies the further normalization $\mu(\Omega) = 1$; in that case, we also write
$\mathbb{P}$ for $\mu$. Any set $S \subseteq \Omega$ such that $\mu(S) = 1$ is called a support of the
probability measure. The definitions given above cover several important cases.


(i) Finite sets with the uniform measure (also known as “counting” measure).
In this case, $\Omega$ is finite, all sets are in $\mathcal{A}$ (i.e., are measurable), and
($\|\cdot\|$ denotes cardinality)

\[ \mu(E) := \frac{\|E\|}{\|\Omega\|}. \]

Non-uniform measures over a finite set $\Omega$ are determined by assigning a non-negative
weight $p(\omega)$ to each element of $\Omega$ (with $\sum_{\omega\in\Omega} p(\omega) = 1$) and setting

\[ \mu(E) := \sum_{e\in E} p(e). \]

(We also write $\mathbb{P}(e)$ for $\mathbb{P}(\{e\}) \equiv \mu(\{e\}) = p(e)$.) In this book, $\Omega$ is
usually the subclass $\mathcal{C}_n$ formed by the objects of size $n$ in some combinatorial class
$\mathcal{C}$. The uniform measure is usually assumed, although suitably weighted models often
prove to be of interest: see for instance in Chapter III the discussion of weighted word
models and Bernoulli trials as well as the case of weighted tree models and branching processes.

¹ For this entry we refer to the vivid and well-motivated presentation in Williams’ book [609] or to
many classical treatises such as those by Billingsley [68] and Feller [205].

(ii) Discrete probability measures over the integers (supported by $\mathbb{Z}$ or $\mathbb{Z}_{\ge0}$).
In this case the measure is determined by a function $p : \mathbb{Z} \mapsto \mathbb{R}_{\ge0}$ and

\[ \mu(E) := \sum_{e\in E} p(e), \]

with $\mu(\mathbb{Z}) = 1$. (All sets are measurable.) More general discrete measures supported
by denumerable sets of $\mathbb{R}$ can be similarly defined.

(iii) The real line $\mathbb{R}$ equipped with the $\sigma$-algebra generated by the open intervals
constitutes a standard example of a measurable space; in that case, any member of
the $\sigma$-algebra is known as a Borel set. The measure, denoted by $\lambda$, that assigns to an
interval $(a, b)$ the value $\lambda(a, b) = b - a$ (and is extended non-trivially to all Borel sets
by additivity) is known as the Lebesgue measure. The interval $[0, 1]$ endowed with $\lambda$
is a probability space. The line $\mathbb{R}$ itself is not a probability space since $\lambda(\mathbb{R}) = +\infty$.


In the measure-theoretic framework, a random variable is a mapping $X$ from
a probability space $\Omega$ (equipped with its $\sigma$-algebra $\mathcal{A}$ and its measure
$\mathbb{P}_\Omega$) to $\mathbb{R}$ (equipped with its Borel sets $\mathcal{B}$) such that the preimage
$X^{-1}(B)$ of any $B \in \mathcal{B}$ lies in $\mathcal{A}$. For $B \in \mathcal{B}$, the probability that $X$ lies
in $B$ is then defined as

\[ \mathbb{P}(X \in B) := \mathbb{P}_\Omega(X^{-1}(B)). \]

Since the Borel sets can be generated by the semi-infinite intervals $(-\infty, x]$, this
probability is equivalently determined by the function

\[ F(x) := \mathbb{P}(X \le x), \]


which is called the distribution function or cumulative distribution function of X. It 
is then possible to introduce random variables directly by means of distribution func- 
tions, see the entry below, Random variables. 


Integration. The next step is to go from measures of sets to integrals of (real-valued)
functions. Lebesgue integrals are constructed, first for indicator functions of
intervals, then for simple (staircase) functions, then for non-negative functions, finally
for integrable functions. One defines in this way, for an arbitrary measure $\mu$, the
Lebesgue integral

\[ \int f\,d\mu, \quad\text{also written}\quad \int f(x)\,d\mu(x) \quad\text{or}\quad \int f(x)\,\mu(dx), \tag{1} \]

where the last notation is often preferred by probabilists. The basic idea is to decompose
the domain of values of $f$ into finitely many measurable sets $(A_i)$ and, for a
positive function $f$, consider the supremum over all finite decompositions $(A_i)$

\[ \int f\,d\mu := \sup_{(A_i)}\sum_i \inf_{\omega\in A_i} f(\omega)\,\mu(A_i). \tag{2} \]

(Thus Riemann integration proceeds by decomposing the domain of the function’s
arguments while Lebesgue integration decomposes the domain of values and appeals to
a richer notion of measure for point sets.)

In (1) and (2), the possibility exists that $\mu$ assigns a non-zero measure to certain
individual points. In such a context, the integral is sometimes referred to as
the Lebesgue–Stieltjes integral. It suitably generalizes the Riemann–Stieltjes integral
which, given a real-valued function $M$, defines the following extension of the standard
Riemann integral:

\[ \int f(x)\,dM(x) = \lim_{(B_k)}\sum_k f(x_k)\,\Delta_{B_k}(M). \tag{3} \]

There the $B_k$ form a finite partition of the domain in which the argument of $f$ ranges,
the limit is taken as the largest $B_k$ tends to 0, each $x_k$ lies in $B_k$, and $\Delta_{B_k}(M)$ is the
variation of $M$ on $B_k$.

The great advantage of Stieltjes (hence automatically of Lebesgue) integrals is to 
unify many of the formulae relative to discrete and continuous probability distributions 
while providing a simple framework adapted to mixed cases. 


C.2. Random variables 


A real random variable $X$ is fully characterized by its (cumulative) distribution
function

\[ F_X(x) := \mathbb{P}(X \le x), \]

which is a non-decreasing right-continuous function satisfying $F(-\infty) = 0$,
$F(+\infty) = 1$.

A variable is discrete if it is supported by a finite or denumerable set. Almost all
discrete distributions in this book are supported by $\mathbb{Z}$ or $\mathbb{Z}_{\ge0}$. (An interesting
exception is the collection of distributions occurring in longest runs of words, Chapter IV,
p. 308.)

A variable $X$ is continuous if it assigns zero probability mass to any finite or
denumerable set. In particular, it has no jump. An easy theorem states that any
distribution function can be decomposed into a discrete and a continuous part,

\[ F(x) = c_1F^d(x) + c_2F^c(x), \qquad c_1 + c_2 = 1. \]

(The jumps must sum to at most 1, hence their set is at most denumerable.) A variable
is absolutely continuous if it assigns zero probability mass to any Borel set of measure 0.
In that case, the Radon–Nikodym Theorem asserts that there exists a function
$w$ such that

\[ F_X(x) = \int_{-\infty}^{x} w(y)\,dy. \]

(There, in all generality, the Lebesgue integral is required but the Riemann integral is
sufficient for all practical purposes in this book.) The function $w(x)$ is called a density
of the random variable $X$ (or of its distribution function). When $F_X$ is differentiable
everywhere it admits the density

\[ w(x) = \frac{d}{dx}F_X(x), \]

by the Fundamental Theorem of Calculus.

▷ C.1. The Lebesgue decomposition theorem. It states that any distribution function $F(x)$
decomposes as

\[ F(x) = c_1F^d(x) + c_2F^{ac}(x) + c_3F^s(x), \qquad c_1 + c_2 + c_3 = 1, \]

where $F^d$ is discrete, $F^{ac}$ is absolutely continuous, and $F^s$ is continuous but singular,
i.e., it is supported by a Borel set of Lebesgue measure 0. Singular random variables are
constructed, e.g., from the Cantor set. ◁

In this book, all combinatorial distributions are by nature discrete (and then
supported by $\mathbb{Z}_{\ge 0}$). All continuous distributions obtained as limits of
discrete ones are, in our context, absolutely continuous and the qualifier “absolutely” is
globally understood when discussing continuous distributions.

If $X$ is a random variable, the expectation of a function $g(X)$ is defined as

$$\mathbb{E}(g(X)) = \int_{\mathbb{R}} g(x)\, dF(x),$$

which involves the distribution function $F$ of $X$. In particular the expectation or mean
of $X$ is $\mathbb{E}(X)$, and generally its moment of order $r$ is

$$\mu^{(r)} = \mathbb{E}(X^r).$$

(These quantities may not exist for $r \ne 0$.)
> C.2. Alternative formulae for expectations. If $X$ is supported by $\mathbb{R}_{\ge 0}$:

$$\mathbb{E}(X) = \int_0^{\infty} (1 - F(x))\, dx.$$

If $X$ is supported by $\mathbb{Z}_{\ge 0}$:

$$\mathbb{E}(X) = \sum_{k \ge 0} \mathbb{P}(X > k).$$

Proofs are by partial integration and summation: for instance with $p_k = \mathbb{P}(X = k)$,

$$\mathbb{E}(X) = \sum_{k \ge 1} k\, p_k = (p_1 + p_2 + p_3 + \cdots) + (p_2 + p_3 + \cdots) + (p_3 + \cdots) + \cdots.$$

Similar formulae hold for higher moments. <
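The tail-sum formula of Note C.2 is easy to check on a small example. The following is a minimal sketch, not taken from the text: the helper names `mean_direct` and `mean_by_tails` and the fair-die distribution are assumptions made for illustration, and exact rational arithmetic is used so that the two sides can be compared without rounding.

```python
from fractions import Fraction

def mean_direct(pmf):
    """E(X) = sum_k k * p_k, for a pmf given as a list indexed by k >= 0."""
    return sum(k * p for k, p in enumerate(pmf))

def mean_by_tails(pmf):
    """E(X) = sum_{k>=0} P(X > k), accumulated from running tail sums."""
    total = Fraction(0)
    tail = sum(pmf)          # P(X >= 0), equal to 1 for a probability distribution
    for p in pmf:
        tail -= p            # tail is now P(X > k) for the current k
        total += tail
    return total

# Illustrative distribution (an assumption): a fair die with faces 1..6, so p_0 = 0.
die = [Fraction(0)] + [Fraction(1, 6)] * 6

print(mean_direct(die), mean_by_tails(die))  # both print 7/2
```

The rearrangement is the same one displayed in the note: each $p_k$ is counted once in each of the tails $\mathbb{P}(X > 0), \ldots, \mathbb{P}(X > k-1)$, hence $k$ times in total.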


C.3. Transforms of distributions 


The Laplace transform of $X$ (or of its distribution function $F$) is defined by

$$\lambda_X(s) := \mathbb{E}(e^{sX}) = \int_{-\infty}^{+\infty} e^{sx}\, dF(x).$$

(If $F$ has a discrete component, then integration is to be taken in the sense of
Lebesgue–Stieltjes or Riemann–Stieltjes.) The Laplace transform is also known as
the moment generating function (see below for an existential discussion). The
characteristic function is defined by

$$\phi_X(t) = \mathbb{E}(e^{itX}) = \int_{-\infty}^{+\infty} e^{itx}\, dF(x),$$

and it is a Fourier transform. Both transforms are formal variants of one another and

$$\phi_X(t) = \lambda_X(it).$$
If $X$ is discrete and supported by $\mathbb{Z}$, then its probability generating function
(PGF) is, as defined in Appendix A.3: Combinatorial probability, p. 727:

$$P_X(u) := \mathbb{E}(u^X) = \sum_{k \in \mathbb{Z}} \mathbb{P}(X = k)\, u^k.$$

As an analytic object this always exists when $X$ is non-negative (supported by
$\mathbb{Z}_{\ge 0}$), in which case the PGF is analytic at least in the open disc $|u| < 1$.
If $X \in \mathbb{Z}$ assumes arbitrarily large negative values, then the PGF certainly exists
on the unit circle, but sometimes not on a larger domain. The precise domain of existence of
the PGF as an analytic function depends on the geometric rate of decay of the left and right
tails of the distribution, that is, of $\mathbb{P}(X = k)$ as $k \to \pm\infty$. The
characteristic function of the variable $X$ (and of its distribution function $F_X$) is

$$\phi_X(t) = \mathbb{E}(e^{itX}) = P_X(e^{it}) = \sum_{k \in \mathbb{Z}} \mathbb{P}(X = k)\, e^{ikt}.$$

It exists for all real values of $t$. The Laplace transform of the discrete variable $X$ is

$$\lambda_X(s) = \mathbb{E}(e^{sX}) = P_X(e^{s}) = \sum_{k \in \mathbb{Z}} \mathbb{P}(X = k)\, e^{ks}.$$
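The three descriptions of a discrete law (PGF, characteristic function, Laplace transform) can be compared numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the binomial law with PGF $(q + pu)^n$ is used as a test case because its transforms are known in closed form.

```python
import cmath
import math

def binomial_pmf(n, p):
    """Probabilities P(X = k), k = 0..n, of a Binomial(n, p) variable."""
    return [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def pgf(pmf, u):
    """P_X(u) = sum_k P(X = k) u^k; u may be real or complex."""
    return sum(pk * u**k for k, pk in enumerate(pmf))

def char_fn(pmf, t):
    """phi_X(t) = sum_k P(X = k) e^{ikt}."""
    return sum(pk * cmath.exp(1j * k * t) for k, pk in enumerate(pmf))

def laplace(pmf, s):
    """lambda_X(s) = sum_k P(X = k) e^{ks}."""
    return sum(pk * math.exp(k * s) for k, pk in enumerate(pmf))

n, p = 10, 0.3          # illustrative parameters (assumptions)
pmf = binomial_pmf(n, p)
t, s = 0.7, -0.4        # arbitrary evaluation points

# phi_X(t) = P_X(e^{it}) and lambda_X(s) = P_X(e^s); both differences are at rounding level.
print(abs(char_fn(pmf, t) - pgf(pmf, cmath.exp(1j * t))))
print(abs(laplace(pmf, s) - pgf(pmf, math.exp(s))))
```

The same values are obtained from the closed form $(q + pu)^n$ evaluated at $u = e^{it}$ and $u = e^{s}$, which is what makes PGFs convenient entry points to both transforms.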


If $X$ is a continuous random variable with distribution function $F(x)$ and density
$w(x)$, then the characteristic function is expressed as

$$\phi_X(t) = \mathbb{E}(e^{itX}) = \int_{-\infty}^{+\infty} e^{itx}\, w(x)\, dx,$$

and the Laplace transform is

$$\lambda_X(s) = \mathbb{E}(e^{sX}) = \int_{-\infty}^{+\infty} e^{sx}\, w(x)\, dx.$$

The Fourier transform always exists for real arguments (by integrability of the Fourier
kernel $e^{itx}$ whose modulus is 1). The Laplace transform, when it exists in a strip,
extends analytically the characteristic function via the equality $\phi_X(t) = \lambda_X(it)$.
The Laplace transform is also called the moment generating function since an alternative
formulation of its definition, valid for discrete and continuous cases alike, is

$$\lambda_X(s) = \sum_{k \ge 0} \mathbb{E}(X^k)\, \frac{s^k}{k!},$$

which indeed represents the exponential generating function of moments. (We avoid
this terminology in the text, because of a possible confusion with the many other types
of generating functions employed in this book.)

The importance of the transforms is due to the existence of continuity theorems, by
which convergence of distributions can be established via convergence of transforms.

> C.3. Centring, scaling, and standardization. Let $X$ be a random variable. Define
$Y = (X - \mu)/\sigma$. The representations as expectations of the Laplace transform and of
the characteristic function make it obvious that

$$\phi_Y(t) = e^{-\mu i t/\sigma}\, \phi_X\!\left(\frac{t}{\sigma}\right), \qquad \lambda_Y(s) = e^{-\mu s/\sigma}\, \lambda_X\!\left(\frac{s}{\sigma}\right).$$

One says that $Y$ is obtained from $X$ by centring (by a shift of $\mu$) and scaling (by a
factor of $\sigma$). If $\mu$ and $\sigma$ are the mean and standard deviation of $X$, then one
says that $Y$ is a standardized version of $X$. <


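> C.4. Moments and transforms. The moments are accessible from either transform,

$$\mu^{(r)} := \mathbb{E}(X^r) = \left.\frac{d^r}{ds^r} \lambda(s)\right|_{s=0} = (-i)^r \left.\frac{d^r}{dt^r} \phi(t)\right|_{t=0}.$$

In particular, we have

$$\begin{aligned}
\mu &= \left.\frac{d}{ds} \lambda(s)\right|_{s=0} = -i \left.\frac{d}{dt} \phi(t)\right|_{t=0}, \\
\mu^{(2)} &= \left.\frac{d^2}{ds^2} \lambda(s)\right|_{s=0} = -\left.\frac{d^2}{dt^2} \phi(t)\right|_{t=0}, \\
\sigma^2 &= \left.\frac{d^2}{ds^2} \log \lambda(s)\right|_{s=0} = -\left.\frac{d^2}{dt^2} \log \phi(t)\right|_{t=0}.
\end{aligned} \tag{4}$$

The direct expression of the standard deviation in terms of $\log \lambda(s)$, called the
cumulant generating function, often proves computationally handy. <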
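The formulae (4) can be tried out numerically. The sketch below is an illustration only: it assumes a binomial variable, whose Laplace transform $(q + p e^{s})^n$ follows from the PGF $(q + pu)^n$, and it replaces the derivatives by central finite differences (a crude numerical device chosen for brevity, not a method used in the text). The function names and the step size `h` are assumptions.

```python
import math

def laplace_binomial(s, n=12, p=0.25):
    """Laplace transform E(e^{sX}) of a Binomial(n, p) variable: (q + p e^s)^n."""
    return (1 - p + p * math.exp(s)) ** n

def cgf(s):
    """Cumulant generating function log lambda(s)."""
    return math.log(laplace_binomial(s))

def d1(f, x=0.0, h=1e-4):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def d2(f, x=0.0, h=1e-4):
    """Central-difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

mean = d1(laplace_binomial)            # approximates mu = n p
second_moment = d2(laplace_binomial)   # approximates E(X^2) = n p q + (n p)^2
variance = d2(cgf)                     # approximates sigma^2 = n p q

print(mean, second_moment, variance)
```

With $n = 12$ and $p = 1/4$ the exact values are $\mu = 3$, $\mathbb{E}(X^2) = 11.25$ and $\sigma^2 = 2.25$; the printed values agree with these up to the finite-difference error. The variance is read off directly from $\log \lambda(s)$, without forming $\mu^{(2)} - \mu^2$.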


> C.5. Mellin transforms of distributions. The quantity $M(s) := \mathbb{E}(X^{s-1})$ is the
Mellin transform of $X$ or of its distribution function $F$, when $X$ is supported by
$\mathbb{R}_{\ge 0}$ (see Appendix B.7: Mellin transform, p. 762). In particular, if $X$ admits a
density, then this notion coincides with the usual definition of a Mellin transform. When it
exists, the value of the Mellin transform at an integer $s = k$ provides the moment of order
$k - 1$; at other points, it provides moments of fractional order. <


> C.6. A “symbolic” fragment of probability theory. Consider discrete random variables
supported by $\mathbb{Z}_{\ge 0}$. Let $X, X_1, \ldots$ be independent random variables with
PGF $p(u)$ and let $Y$ have PGF $q(u)$. Then, certain natural operations admit a translation
into PGFs.

    Operation                                             |  PGF
    switch       (Bern($\lambda$) $\Rightarrow$ $X \mid Y$)   |  $\lambda\, p(u) + (1 - \lambda)\, q(u)$
    sum          $X + Y$                                  |  $p(u) \cdot q(u)$
                 $X_1 + \cdots + X_n$                     |  $p(u)^n$
    random sum   $X_1 + \cdots + X_Y$                     |  $q(p(u))$
    size bias    $\partial X$                             |  $u\, p'(u) / p'(1)$

(“Bern” means a Bernoulli $\{0, 1\}$ variable $B$, with $\mathbb{P}(1) = \lambda$; the switch is
interpreted as $BX + (1 - B)Y$. Size-biased distributions occur in Chapter VII.) <
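The dictionary of Note C.6 can be exercised on finitely supported laws, for which a PGF is just a polynomial. The following is a minimal sketch under that assumption: PGFs are represented as coefficient lists `[p_0, p_1, ...]`, and all helper names (`poly_mul`, `switch`, `random_sum`, `size_bias`) as well as the two example distributions are hypothetical choices made for illustration.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Product of two coefficient lists; the PGF of a sum of independent variables."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def poly_add(a, b):
    """Sum of two coefficient lists of possibly different lengths."""
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def switch(lam, p, q):
    """lam * p(u) + (1 - lam) * q(u): choose X with probability lam, else Y."""
    return poly_add([lam * c for c in p], [(1 - lam) * c for c in q])

def random_sum(p, q):
    """q(p(u)): PGF of X_1 + ... + X_Y, computed by Horner's rule."""
    out = [Fraction(0)]
    for c in reversed(q):
        out = poly_add(poly_mul(out, p), [c])
    return out

def size_bias(p):
    """u p'(u) / p'(1): the coefficient of u^k becomes k p_k / E(X)."""
    mean = sum(k * c for k, c in enumerate(p))
    return [k * c / mean for k, c in enumerate(p)]

half = Fraction(1, 2)
p = [half, half]          # X: a fair {0, 1} coin
q = [0, half, half]       # Y: uniform on {1, 2}

print(poly_mul(p, q))                 # X + Y
print(random_sum(p, q))               # X_1 + ... + X_Y
print(size_bias(q))                   # size-biased version of Y
print(switch(Fraction(1, 3), p, q))   # Bernoulli(1/3) switch between X and Y
```

Each result is again a probability distribution (its coefficients sum to 1), which gives a quick sanity check on any implementation of these operations. Horner's rule may leave trailing zero coefficients in `random_sum`; they are harmless.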


C.4. Special distributions 


A compendium of special probability distributions of frequent occurrence in ana- 
lytic combinatorics is provided by Figure C.1. 

A Bernoulli trial of parameter $q$ is an event such that it has probability $p$ of
having value 1 (interpreted as “success”) and probability $q$ of having value 0 (interpreted
as “failure”), with $p + q = 1$. Formally, this is the set $\Omega = \{0, 1\}$ endowed with
the probability measure $\mathbb{P}(0) = q$, $\mathbb{P}(1) = p$. (By extension, we also refer to
independent experiments with finitely many possible outcomes as Bernoulli trials. In
    Distribution                         Prob. (D), density (C)                                   PGF (D), Char. f. (C)

D   Binomial ($n$, $p$)                  $\binom{n}{k} p^k (1-p)^{n-k}$                           $(q + pu)^n$
D   Geometric ($q$)                      $(1-q)\, q^k$                                            $\frac{1-q}{1-qu}$
D   Neg. binomial[$m$] ($q$)             $\binom{m+k-1}{k} q^k (1-q)^m$                           $\left(\frac{1-q}{1-qu}\right)^m$
D   Log. series ($\lambda$)              $\frac{1}{\log\frac{1}{1-\lambda}}\, \frac{\lambda^k}{k}$   $\frac{\log(1-\lambda u)}{\log(1-\lambda)}$
D   Poisson ($\lambda$)                  $e^{-\lambda}\, \frac{\lambda^k}{k!}$                     $e^{-\lambda(1-u)}$
C   Gaussian or Normal, $\mathcal{N}(0, 1)$   $\frac{e^{-x^2/2}}{\sqrt{2\pi}}$                     $e^{-t^2/2}$
C   Exponential                          $e^{-x}$                                                 $\frac{1}{1-it}$
C   Uniform $[-1/2, +1/2]$               $[[-1/2 \le x \le +1/2]]$                                $\frac{\sin(t/2)}{t/2}$


Figure C.1. A list of commonly encountered discrete (D) and continuous (C) probability
distributions: type, name, probabilities or density, probability generating function or
characteristic function.


that sense, the model of words of some fixed length over a finite alphabet and
non-uniform letter weights (or probabilities) belongs to the category of Bernoulli models;
see Chapter III.) The binomial distribution of parameters $n, q$ is the distribution of the
random variable that represents the number of successes in $n$ independent Bernoulli trials.
This is the probability distribution associated with the game of heads-and-tails. The
geometric distribution is the distribution of a random variable $X$ that records the number of
failures till the first success is encountered in a potentially arbitrarily long sequence
of Bernoulli trials. The negative binomial distribution of index $m$ (written $NB[m]$)
and parameter $q$ corresponds to the number of failures before $m$ successes are
encountered. We have found in Chapter VII that it is systematically associated with the
number of $r$-components in an unlabelled multiset schema $\mathcal{F} = \mathrm{MSET}(\mathcal{G})$
whose composition of singularities is of the exp–log type. The geometric distribution appears
in several schemas related to sequences while the logarithmic series distribution is
closely tied to cycles (Chapter V).
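The description of the negative binomial law as “failures before $m$ successes” means that it is the law of a sum of $m$ independent geometric variables, in agreement with the PGFs of Figure C.1. The sketch below checks this exactly on truncated distributions; the function names and the parameter values are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb

def geometric_pmf(q, kmax):
    """P(X = k) = (1 - q) q^k for k = 0..kmax: k failures before the first success."""
    return [(1 - q) * q**k for k in range(kmax + 1)]

def convolve(a, b, kmax):
    """Law of the sum of two independent variables, truncated at kmax (exact for k <= kmax)."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(kmax + 1)]

def neg_binomial_pmf(m, q, kmax):
    """P(X = k) = binom(m + k - 1, k) q^k (1 - q)^m for k = 0..kmax."""
    return [comb(m + k - 1, k) * q**k * (1 - q)**m for k in range(kmax + 1)]

q, m, kmax = Fraction(1, 3), 4, 8   # illustrative parameters (assumptions)
geo = geometric_pmf(q, kmax)

acc = geo
for _ in range(m - 1):
    acc = convolve(acc, geo, kmax)  # add one more independent geometric variable

print(acc == neg_binomial_pmf(m, q, kmax))  # True
```

Truncation at `kmax` does not affect the comparison, since the probability that a sum equals $k$ only involves summands of size at most $k$.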

The Poisson distribution counts among the most important distributions of probability
theory. Its essential properties are recalled in Figure C.1. It occurs for instance
in the distribution of singleton cycles and of $r$-cycles in a random permutation and
more generally in labelled composition schemes (Chapter IX).

In this book all probability distributions arising directly from combinatorics are a
priori discrete as they are defined on finite sets, typically a certain subclass
$\mathcal{C}_n$ of a combinatorial class $\mathcal{C}$. However, as the size $n$ of the objects
considered grows, these finite distributions usually approach a continuous limit. In this
context, by far the most important law is the Gaussian law, also known as the normal law,
which is defined by its density and its distribution function:

$$g(x) = \frac{e^{-x^2/2}}{\sqrt{2\pi}}, \qquad \Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-y^2/2}\, dy. \tag{5}$$

The corresponding Laplace transform is then evaluated by completing the square,

$$\lambda(s) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-y^2/2 + sy}\, dy = e^{s^2/2},$$

and, similarly, the characteristic function is $\phi(t) = e^{-t^2/2}$. The distribution of (5) is
referred to as the standard normal distribution, $\mathcal{N}(0, 1)$; if $X$ is
$\mathcal{N}(0, 1)$, the variable $Y = \mu + \sigma X$ defines the normal distribution with
mean $\mu$ and standard deviation $\sigma$, denoted $\mathcal{N}(\mu, \sigma)$.

Among other continuous distributions appearing in this book, we mention the
theta distributions associated with the height of trees and Dyck paths (Chapter V) and
the stable laws, which surface in Chapter IX.
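For the Gaussian law, the completing-the-square evaluation of the Laplace transform can be written out in full. The lines below are a standard computation supplied as a reading aid, not quoted from the text; the last line, for $Y = \mu + \sigma X$, follows directly from the definition $\lambda_Y(s) = \mathbb{E}(e^{sY})$.

```latex
\begin{align*}
  sy - \frac{y^2}{2} &= \frac{s^2}{2} - \frac{(y - s)^2}{2}, \\
  \lambda(s) &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{sy - y^2/2}\, dy
     = e^{s^2/2} \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{-(y - s)^2/2}\, dy
     = e^{s^2/2}, \\
  \phi(t) &= \lambda(it) = e^{(it)^2/2} = e^{-t^2/2}, \\
  \lambda_Y(s) &= \mathbb{E}\bigl(e^{s(\mu + \sigma X)}\bigr) = e^{\mu s}\, \lambda(\sigma s)
     = e^{\mu s + \sigma^2 s^2/2}.
\end{align*}
```

The second line uses only the fact that the shifted Gaussian density still integrates to 1. Differentiating $\log \lambda_Y(s) = \mu s + \sigma^2 s^2/2$ twice at $s = 0$ returns the mean $\mu$ and the variance $\sigma^2$, as expected.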


C.5. Convergence in law 


The central notion, which is of the greatest interest for analytic combinatorics, is 
the notion of convergence in law, also known as weak convergence. 


Definition C.1. Let $F_n$ be a family of distribution functions. The $F_n$ are said to
converge weakly to a distribution function $F$ if pointwise there holds

$$\lim_{n} F_n(x) = F(x), \tag{6}$$

at every continuity point $x$ of $F$. This is expressed by writing $F_n \Rightarrow F$ as well as
$X_n \Rightarrow X$, if $X_n$, $X$ are random variables corresponding to $F_n$, $F$. We say that
$X_n$ converges in distribution or converges in law to $X$.


This definition has the merit of covering discrete and continuous distributions
alike. For discrete distributions supported by $\mathbb{Z}$, an equivalent form of (6) is
$\lim_n F_n(k) = F(k)$ for each $k \in \mathbb{Z}$; for continuous distributions, Equation (6)
just means that $\lim_n F_n(x) = F(x)$ for all $x \in \mathbb{R}$. Although in all generality
anything can tend to anything else, due to the finite nature of combinatorics, we only need
in this book the convergences


Discrete ⇒ Discrete,    Discrete ⇒ Continuous (after standardization).
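A classical instance of the convergence Discrete ⇒ Discrete is the approach of the binomial law of parameters $(n, \lambda/n)$ to the Poisson law of parameter $\lambda$. The sketch below, an illustration with hypothetical function names and an arbitrary choice $\lambda = 2$, evaluates the gap $|F_n(k) - F(k)|$ between the two distribution functions at the integers $k = 0, \ldots, 10$.

```python
import math

def binomial_cdf(n, p, k):
    """F_n(k) = P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def poisson_cdf(lam, k):
    """F(k) = P(X <= k) for X ~ Poisson(lam)."""
    return sum(math.exp(-lam) * lam**j / math.factorial(j) for j in range(k + 1))

lam = 2.0  # illustrative parameter (an assumption)

def max_gap(n, kmax=10):
    """Largest value of |F_n(k) - F(k)| over k = 0..kmax."""
    return max(abs(binomial_cdf(n, lam / n, k) - poisson_cdf(lam, k))
               for k in range(kmax + 1))

for n in (10, 100, 1000, 10000):
    print(n, max_gap(n))
```

The printed gaps shrink as $n$ grows, which is the pointwise convergence of Equation (6) at each integer $k$. No standardization is needed here since the limit law is itself discrete.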


Three major tools can be used to establish convergence in law: characteristic 
functions, Laplace transforms, and moment convergence theorems. 


Characteristic functions and limit laws. Properties of random variables are
reflected by properties of characteristic functions, in accordance with general
principles of Fourier analysis; Figure C.2 offers an aperçu. Most important for us is the
Continuity Theorem for characteristic functions due to Lévy and used extensively in
Chapter IX, starting on p. 639, through the Quasi-powers Theorem of p. 645.
Characteristic function ($\phi(t)$)                                   Distribution function ($F(x)$)

$\phi(0) = 1$                                                          $F(-\infty) = 0$, $F(+\infty) = 1$
$|\phi(t_0)| = 1$ for some $t_0 \ne 0$                                 Lattice distribution, span $2\pi/t_0$
$\phi(t) = 1 + i\mu t + o(t)$ as $t \to 0$                             $\mathbb{E}(X) = \mu < \infty$
$\phi(t) = 1 + i\mu t - \nu t^2/2 + o(t^2)$ as $t \to 0$               $\mathbb{E}(X^2) = \nu < \infty$
$\log \phi(t) = -t^2/2$                                                $X \overset{d}{=} \mathcal{N}(0, 1)$
$\phi(t) \to 0$ as $t \to \infty$                                      $X$ is continuous
$\phi(t)$ integrable (is in $L^1$)                                     $X$ is absolutely continuous;
                                                                       density is $w(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \phi(t)\, dt$
$\lambda(s) := \phi(-is)$ exists in $\alpha < \Re(s) < \beta$          Exponential tails
$\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} |\phi(t)|^2\, dt$      equals $\sum_j (p_j)^2$; the $p_j$ are the jumps
$\phi_n(t) \to \phi(t)$ (point conv.)                                  $F_n \Rightarrow F$ (weak conv.);
                                                                       $X_n \Rightarrow X$ (conv. in distribution)
$\phi_n$ “close” to $\phi$                                             $F_n$ “close” to $F$ (Berry–Esseen)


Figure C.2. The correspondence between properties of the distribution function ($F$)
of a random variable ($X$) and properties of its characteristic function ($\phi$).


Theorem C.1 (Continuity theorem for characteristic functions). Let $Y$, $Y_n$ be random
variables with characteristic functions $\phi$, $\phi_n$. A necessary and sufficient condition
for the weak convergence $Y_n \Rightarrow Y$ is that $\phi_n(t) \to \phi(t)$ for each $t$.


For a proof, see [68, §26]. What is notable is that the theorem provides a nec- 
essary and sufficient condition. In addition, the Berry—Esseen inequalities stated in 
Chapter IX, p. 641, lie at the origin of precise speed of convergence estimates to 
asymptotic limits. 
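Theorem C.1 can be watched at work on the standardized binomial law, whose characteristic function is explicit: by Note C.3 and Figure C.1 it equals $e^{-i\mu t/\sigma} (q + p\, e^{it/\sigma})^n$ with $\mu = np$ and $\sigma = \sqrt{npq}$. The sketch below (hypothetical function names, arbitrary values $p = 0.3$ and $t = 1.5$) measures its distance to the Gaussian characteristic function $e^{-t^2/2}$ at a fixed $t$.

```python
import cmath
import math

def phi_standardized_binomial(n, p, t):
    """Characteristic function of (X - np) / sqrt(npq) for X ~ Binomial(n, p)."""
    q = 1 - p
    mu, sigma = n * p, math.sqrt(n * p * q)
    return cmath.exp(-1j * mu * t / sigma) * (q + p * cmath.exp(1j * t / sigma)) ** n

def phi_gauss(t):
    """Characteristic function of the standard normal law."""
    return math.exp(-t * t / 2)

p, t = 0.3, 1.5  # illustrative values (assumptions)

def gap(n):
    """Distance between the two characteristic functions at the point t."""
    return abs(phi_standardized_binomial(n, p, t) - phi_gauss(t))

for n in (10, 100, 1000, 10000):
    print(n, gap(n))
```

The gap decreases as $n$ grows, at the same value of $t$; pointwise convergence for every real $t$ is precisely the hypothesis of Theorem C.1, which then gives the convergence Discrete ⇒ Continuous of the standardized binomial to $\mathcal{N}(0, 1)$.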

Laplace transforms and limit laws. The continuity theorem for Laplace transforms
is stated in Chapter IX, p. 639. In principle, it is of a more restricted scope
than Theorem C.1 since Laplace transforms need not exist. Also, error bounds derived
from Laplace transforms can be exponentially worse than those resulting from
Berry–Esseen inequalities [557]. For these reasons, the rôle of Laplace transforms in
this book is mostly confined to large deviation estimates (Section IX.10, p. 699).

The method of moments. For the purpose of establishing limit laws in combinatorics,
it may be convenient (sometimes even necessary) to access distributions
by moments. One then attempts to deduce convergence of distributions from convergence
of moments. This approach requires conditions under which a distribution is
uniquely characterized by its moments; finding these is known as the moment problem
in analysis. A lucid discussion is offered by Billingsley in [68, §30], which we
follow.
A distribution function $F(x)$, with $x \in \mathbb{R}$, is characterized by its moments if the
sequence of real numbers

$$\mu_k = \int_{\mathbb{R}} x^k\, dF(x), \qquad k = 0, 1, 2, \ldots,$$

uniquely determines $F$ (that is: $\int x^k\, dF = \int x^k\, dG$ for all $k$ implies $F = G$). The
following basic conditions are known to be sufficient for such a property to hold: (i) $F$
has finite support; (ii) the exponential generating function of $(\mu_k)$ is analytic at 0, that
is, for some $R > 0$, one has

$$\mu_k \frac{R^k}{k!} \to 0, \qquad k \to \infty. \tag{7}$$


(The first case is proved by appealing to Weierstrass’ theorem to the effect that poly- 
nomials are dense among continuous functions over a finite interval with respect to 
the uniform norm; the second case results from the continuity theorem of Laplace 
transforms, which are none other than exponential generating functions of moments.) 
Clearly, the uniform distribution over [0,1], the exponential distribution, and the 
Gaussian distribution are characterized by their moments. 
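The growth condition on $\mu_k R^k / k!$ can be examined on explicit moment sequences. The sketch below works with logarithms to avoid overflow. It relies on standard closed forms that are assumptions of this illustration rather than statements of the text: $\mu_{2j} = (2j)!/(2^j j!)$ for $\mathcal{N}(0, 1)$, $\mu_k = k!$ for the exponential law, and $\mu_k = e^{k^2/2}$ for the exponential of a standard normal (obtained from $\lambda(s) = e^{s^2/2}$ at $s = k$). The function names are hypothetical.

```python
import math

def log_term_gaussian(j, R):
    """log of mu_{2j} R^{2j} / (2j)! = R^{2j} / (2^j j!) for the N(0, 1) law."""
    return 2 * j * math.log(R) - j * math.log(2) - math.lgamma(j + 1)

def log_term_exponential(k, R):
    """log of mu_k R^k / k! = R^k for the exponential law (mu_k = k!)."""
    return k * math.log(R)

def log_term_lognormal(k, R):
    """log of mu_k R^k / k! with mu_k = exp(k^2 / 2) (exponential of a standard normal)."""
    return k * k / 2 + k * math.log(R) - math.lgamma(k + 1)

for k in (10, 50, 100, 200):
    print(k,
          log_term_gaussian(k // 2, 5.0),    # tends to -infinity for every fixed R
          log_term_exponential(k, 0.5),      # tends to -infinity whenever R < 1
          log_term_lognormal(k, 0.01))       # tends to +infinity, however small R is
```

For the Gaussian and exponential laws the logarithm of the term tends to $-\infty$ for a suitable $R > 0$, so the condition holds. For the moment sequence $e^{k^2/2}$ it eventually grows without bound whatever the value of $R$, so this sufficient condition gives no information there.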

Equation (7) expresses the fact that a distribution is characterized by its moments 
provided they do not grow too fast, which indicates that its tails decay sufficiently 
rapidly. Other useful sufficient conditions for F(x) to be characterized by moments 
are [157, XIV.2]: 


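$$\begin{array}{lll}
\text{Carleman}: & \displaystyle\sum_{k=1}^{\infty} (\mu_{2k})^{-1/(2k)} = +\infty & (\operatorname{support}(F) \subset \mathbb{R}) \\[1ex]
\text{Carleman}: & \displaystyle\sum_{k=1}^{\infty} (\mu_{k})^{-1/(2k)} = +\infty & (\operatorname{support}(F) \subset \mathbb{R}_{\ge 0}) \\[1ex]
\text{Krein}: & \displaystyle\int_{-\infty}^{\infty} \log(f(x))\, \frac{dx}{1 + x^2} = -\infty & (F'(x) = f(x)).
\end{array} \tag{8}$$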


One has the following theorem. 
Theorem C.2 (Moment Convergence Theorem). Let $F$ be determined by its moments
and assume that a sequence of distribution functions $F_n(x)$, $x \in \mathbb{R}$, satisfies for
each $k = 0, 1, 2, \ldots$,

$$\lim_{n \to \infty} \int_{\mathbb{R}} x^k\, dF_n(x) = \int_{\mathbb{R}} x^k\, dF(x).$$

Then weak convergence holds: $F_n \Rightarrow F$.


For a proof, see [68, §30]. In this book, moment methods are used to validate the
moment pumping method expounded in Chapter VII, p. 532.
> C.7. The log-normal distribution. As its name indicates, this is the distribution of the
exponential of a standard normal, with density $f(x) = e^{-(\log x)^2/2} / (x\sqrt{2\pi})$, for
$x > 0$. The distribution with density $f(x)(1 + \sin(2\pi \log x))$ has the same moments
(Stieltjes, 1895). <
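The equality of moments claimed in Note C.7 can be verified in a few lines. The computation below is a sketch supplied as a reading aid, not taken from the text; it uses the change of variables $x = e^{y}$ and assumes that the Gaussian Laplace transform $\lambda(s) = e^{s^2/2}$ remains valid for complex $s$, which holds because both sides are entire functions of $s$.

```latex
\begin{align*}
  \int_0^{\infty} x^k f(x) \sin(2\pi \log x)\, dx
    &= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} e^{ky}\, e^{-y^2/2} \sin(2\pi y)\, dy
       && (x = e^{y}) \\
    &= \Im\, \lambda(k + 2\pi i)
     = \Im\, e^{(k + 2\pi i)^2/2} \\
    &= e^{k^2/2 - 2\pi^2} \sin(2\pi k) = 0
       && (k = 0, 1, 2, \ldots).
\end{align*}
```

Since $1 + \sin(2\pi \log x) \ge 0$, the perturbed function is a genuine probability density (the case $k = 0$ shows that it has total mass 1), and for every integer $k \ge 0$ its moment of order $k$ equals that of the log-normal law, namely $\lambda(k) = e^{k^2/2}$.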
a..b (integer interval), 17
∂ (derivative), 87
E (expectation), 113, 728, 772
ℑ (imaginary part), 230
lg (binary logarithm), 308
m (analytic mean), 645
O (asymptotic notation), 722
o (asymptotic notation), 722
P (probability), 113, 157
R (resultant notation), 739
R_conv (radius of convergence), 230
ℜ (real part), 230
Res (residue operator), 233
v (analytic variance), 645
V (variance), 728
Δ-domain, see Delta-domain
Θ (asymptotic notation), 723
Θ (pointing), 86
σ (standard deviation), 728
Ω (asymptotic notation), 723
⌊·⌉ (nearest integer function), 43, 260
[z^n] (coefficient extractor), 19
[[·]] (Iverson’s notation), 58
≅ (combinatorial isomorphism), 19
≐ (numerically close), 7
≫ (much larger), 566
≪ (much smaller), 566
≈ (roughly equal), 50
∼ (asymptotic notation), 722
⋈ (exponential order), 243
∮ (contour integral), 549
⋆ (labelled product), 101
+, see disjoint union
⟨·⟩ (strip of C), 763
∘ (substitution), 87, 136


CYC (cycle construction), 26, 103
MSET (multiset construction), 26
PSET (powerset construction), 26
SEQ (sequence construction), 25, 102
SET (set construction), 102
𝔎_Ω (Ω-restricted construction), 30


Abel identity, 733 

Abel—Plana summation, 238 
adjacency matrix (of graph), 336 
admissibility (of function), 564-572 
admissible construction, 22, 100
Airy area distribution, 365, 534, 706 
Airy function, 534, 598, 606, 707, 714 
Airy map distribution, 713-714 
alcohol, 284, 477-478 
algebraic curve, 495 
algebraic function, 482-518, 539 
asymptotics, 493-518 
branch, 495 
coefficient, 500-518 
elimination, 739-741 
exceptional set, 495-496 
Newton polygon, 498-500 
Puiseux expansion, 444, 498-500 
singularities, 495-518 
singularity perturbation, 681-684
algebraic topology, 200 
algebraic—logarithmic singularity, 376, 393 
algorithm 
approximate counting, 313-315 
balanced tree, 91, 280 
binary adder, 308 
binary search tree, 203, 428-430, 685, 688 
digital tree (trie), 356, 693 
Floyd’s cycle detection, 465-466 
hashing, 111, 146, 178, 534, 600 
integer gcd, 664 
irreducible polynomials, 450 
Lempel—Ziv compression, 694 
paged trees, 688 
Pollard’s integer factoring, 466-467 
polynomial factorization, 449, 450 
polynomial gcd, 662-664 
shake and paint, 417 
TCP protocol, 315 
alignment, 119, 261, 296, 654 
alkanes, 477-479 
allocation, see balls-in-bins model 
alphabet, 49 
ambiguity 
context-free grammar, 82 
regular expression, 316, 734 
analytic continuation, 239 
analytic depoissonization, 572-574, 694 
analytic function, 230-238 
equivalent definitions, 741-743 
composition, 411-417 
differentiation, 418-422, 741-743
Hadamard product, 422-427 
integration, 418-422, 742-743 
inversion, 249, 275-280, 402-407 
iteration, 280-283 
Lindelöf integrals, 237, 409
animal (lattice), 80-82, 396 
aperiodic function, system, see periodicity con- 
ditions 
apparent singularity, see singularity, apparent 
approximate counting, 313-315 
area (of Dyck path), 330, 533-534, 706 
argument principle, 270 
arithmetical functions, 721 
arithmetical semigroups, 91, 673 
arrangement, 112, 113 
asymptotic 
algebraic, 518 
expansion, 724 
notations, 722-725 
scale, 724-725 
atom, 24, 98 
autocorrelation (in words), 60, 271, 659 
automaton, see finite automaton 
average, see expectation 


balanced tree, see tree 
ballot problem, 68, 76 
balls-in-bins model, 113, 177-178 
capacity, 598-600 
Poisson law, 177 
Bell numbers, 109 
asymptotics, 560-562, 762 
Bell polynomials, 188 
Bernoulli number, 747 
Bernoulli numbers, 268, 726-727, 766 
Bernoulli trial, 191, 307, 774 
Berry—Esseen inequalities, 624-625, 641, 777 
Bessel function, 46, 332, 534, 607, 661, 753 
Beta function (B), 384, 524, 601, 747 
BGF, see bivariate generating function
bibliometry, 45
bijective equivalence (≅), 19
binary decision tree (BDT), 78 
binary search tree (BST), 203, 428-430, 685, 
688 
binary tree, 738 
binomial coefficient, 100 
asymptotics, 380-385 
central approximation, 160, 328, 588, 642, 
761-762 
sum of powers, 761-762 
binomial convolution, 100 
binomial distribution, 627, 642, 775 
birth and death process, 319 
birth process, 312 
birthday paradox, 114-119, 192, 416 
bivariate generating function (BGF), 157 
Boltzmann model, 280, 566, 701 
boolean function, 70, 77-78, 487-488 


bootstrapping, 309 

bordering condition (permutation), 202 
Borges’s Theorem, 61-62, 680, 683-684 
Borges, Jorge Luis, 61 

boson, 532 

boxed product, 139-142 

branch (of curve), 495 

branch point (analytic function), 230, 277 
branching processes, 196-198 

bridge, 707 

bridge (lattice path), 77, 506-513, 636 
Brownian motion, 185, 360, 413, 534, 706 
Bürmann inversion, see Lagrange inversion


canonicalization, 87 
cartesian product construction (x), 23 
Catalan numbers (C_n), 17, 34-36, 38, 67, 73-78, 738
asymptotics, 7, 37-39, 383 
generating function, 35 
Catalan sum, 417
Catalan tree, 35, 173, 738 
Cauchy’s residue theorem, 234 
Cauchy—Riemann equations, 742 
Cayley tree, 127-129, 179 
Cayley tree function, see Tree function (T) 
Central Limit Theorem (CLT), 593, 642-643, 
696 
centring (random variable), 773 
characteristic function (probability), 639, 772-774
Chebyshev inequalities, 161, 729 
Chebyshev polynomial, 327 
chessboard, 373 
circuit (in graph), 336, 346 
circular graph, 99 
class (of combinatorial structures), 16 
labelled, 95-149 
cluster, 209, 212 
coalescence of saddle-point 
with other saddle-point, 606 
with roots, 589 
with singularity, 590-591 
code (words), 62 
coding theory, 38, 53, 62, 246 
coefficient extractor ([z^n]), 19
coin fountain, 331, 662 
combination, 52 
combinatorial 
class, 16, 96 
isomorphism (≅), 19
parameter, 151-219 
sums, 415-417 
combinatorial chemistry, 443, 474-479 
combinatorial identities, 747-753 
combinatorial probability, 727-729 
combinatorial schema, see schema 
complete generating function, 186-198 
complex differentiability, 231 
complex dynamics, 280, 535 


complexity theory, 77 
composition (of integer), 39-49 
Carlitz type, 201, 206, 263, 666 
complete GF, 188 
cyclic (wheel), 47 
largest summand, 169, 298, 300 
local constraints, 199-200, 263 
number of summands, 44, 167-168, 654 
prime summands, 43, 298-300, 654 
profile, 169, 296 
r—parts, 168 
restricted summands, 297-300 
composition schema, 411-417, 628, 703 
critical, 412, 416-417, 707-714 
subcritical, 629, 634 
supercritical, 414-416, 650-655 
computable numbers, 251 
computer algebra, see symbolic manipulation 
concentration (of probability distribution), 161-163
conformal map, 231 
conjugacy principle (paths), 75 
connection problem, 470-472, 483-505, 521, 
525, 583 
constructible class, 250-255 
construction 
cartesian product (x), 23 
cycle (CYC), 26, 165, 729-730 
labelled, 103, 174 
disjoint union (+), 25 
implicit, 88-91 
labelled product (x), 100-102 
multiset (MSET), 26, 165 
pointing (Θ), 86-88, 198
powerset (PSET), 26, 165, 174 
sequence (SEQ), 25, 165 
labelled, 102, 174 
set (SET), 102 
substitution (∘), 86-88, 198-201
context-free 
asymptotics, 440, 482-484 
language, 82-83, 482 
specification, 78-83, 482-488 
continuant polynomial, 321 
continuation (analytic), 239 
continued fraction, 195, 216, 283, 318-336, 663 
continuity theorems (probability), 623-627, 
639-641, 776-777 
continuous random variable, 638-644, 771 
contour integral (∮), 549
convergence 
in law, 620-623, 638-639 
speed (probability), 624-625, 641 
convexity (of GFs), 280, 550 
correlation, see autocorrelation 
coupon collector problem, 114-119, 192 
cover time (walk), 363 
covering (of interval), 27 
critical composition schema, see composition schema
critical point, 607 
cumulant (of random variable), 647, 774 
cumulated value (of parameter), 159 
cumulative distribution function, see distribu- 
tion function 
cumulative generating function, 159 
cycle construction (CYC), 26, 165, 729-730 
labelled, 103, 174 
undirected, labelled, 133 
cycle lemma (paths), 75 
cyclic permutation, 99 


Δ-domain, 389, 398

D-finite functions, see holonomic functions 

Daffodil Lemma, 266 

Darboux’s method, 436 

data compression, 274, 694 

data mining, 315, 417 

de Bruijn graph, 354-355 

Dedekind η function, 577

degree (of tree node), 737 

Delta-domain (Δ), 389, 398

density (random variable), 771 

denumerant, 43, 257-258 

dependency graph, 33, 250, 340, 483 

depoissonization, 572-574 

derangement, 122, 207, 261, 368, 448, 671, 760 

derivative (∂), 87

devil’s staircase, 352-353 

dice games, 587 

Dickman function, 675 

difference equation, see q-calculus

differential equations, 518-532, 581-585, 684-690, 748-753

differential field, 522 

differentiation (singular), 418-422 

digital tree (trie), 356, 693 

digraph, see graph 

dilogarithm, 238, 410, 749-750 

dimensioning heuristic (saddle point), 554, 555, 
566 

diophantine inequalities (linear), 46 

directed graph, 336 

Dirichlet generating function (DGF), 664, 721, 
763 

disc of convergence (series), 230, 726 

discrete random variable, 620-628, 771 

discriminant (of polynomial), 495, 741 

disjoint union construction (+), 25, 100 

distribution, see probability distribution 

distribution function (random variable), 621, 
638, 641, 771 

divergent series, 89, 138, 731 

DLW Theorem, see Drmota—Lalley—Woods 
Theorem 

dominant singularity, 242 

double exponential distribution, 118, 308 

Drmota—Lalley—Woods Theorem, 443, 482-493 
drunkard problem, 90, 425-427

Dyck path, see also excursion, 77, 319, 511 
area, 330, 533-534, 706-707 
height, 326-330 
initial ascents, 635 

dynamical system, 318, 664, 716 


EGF, see exponential generating function 
Ehrenfest urn model, 118, 336, 530 
eigenvalue, see matrix 
EIS (Sloane’s Encyclopedia), 18 
Eisenstein’s lemma (algebraic functions), 505 
elimination (algebraic function), 739-741 
elliptic function, 330, 531 
entire function, 243 
entropy, 587 
error function (erf), 638 
Euclid’s algorithm, see greatest common divisor (gcd)
Euler numbers, 144, 268-269
Euler’s constant (γ), 117, 726, 746, 747
Euler–Maclaurin summation, 238, 268, 726-727, 766
Eulerian numbers, 210, 658, 697-698, 702 
Eulerian tour (in graph), 354 
exceedances (in permutations), 368 
exceptional set (algebraic function), 495-496 
excursion (lattice path), 77, 319, 506-513 
exp—log schema, 441-442, 445-452, 670-676 
exp—log transformation, 29, 85 
expectation (or mean, average), E, 113, 158, 
728, 772 
exponential families (of functions), 197, 701 
exponential generating function (EGF) 
definition, 97 
multivariate, 156 
product, 100 
exponential growth formula, 243-249 
exponential order (⋈), 243
exponential–polynomial, 255, 290-293, 319-326


Faa di Bruno’s formula, 188 

factorial moment, 158, 728 

factorial, falling, 520, 751 

Ferrers diagram, 39 

Fibonacci numbers (F_n), 42, 59, 256, 363
Fibonacci polynomial, 327 

finite automaton, 56, 339-356 

finite field, 90 

finite language, 64 

finite state model, 350, 358-367 
forest (of trees), 68, 128, 737 

formal language, see language 
formal power series, see power series 
formal topology (power series), 731 
four-colour theorem, 513 

Fourier transform, 639, 772 

fractals, 282 

fragmented permutation, 125 
asymptotics, 247, 562-563
free group, 206 
free tree, see tree, unrooted 
function (of complex variable) 
analytic, 230-238 
differentiable, 231 
entire, 231, 243 
holomorphic, 231 
meromorphic, 233 
functional equation, 33, 275-285 
Dedekind η function, 577
difference equation, see q-calculus
elliptic theta function, 330 
Gamma function, 744 
kernel method, 508 
quadratic method, 515 
zeta function, 747 
functional graph, 129-132, 480, 673 
Fundamental Theorem of Algebra, 270, 546 


Galton—Watson process, 197 
gambler ruin sequence, 76 
gamma constant (γ), see Euler’s constant
Gamma function (Γ), 378, 743-747
Gaussian binomial, 45 
Gaussian distribution, 593-594, 638, 776 
Gaussian integral, 744 
general tree, 738 
generating function 
algebraic, see also algebraic function, 518 
complete, 186-198 
exponential, 95-149 
holonomic, see holonomic functions 
horizontal, 153 
multivariate, 151-219 
ordinary, 15 
rational, see rational function 
vertical, 153 
geometric distribution, 775 
Gessel’s calculus, 752-753 
GF, see generating function 
golden ratio (φ), 42, 91
graph 
acyclic, 132, 406 
adjacency matrix, 336 
aperiodic, 341 
bipartite, 138 
circuit, 336, 346 
circular, 99 
colouring, 513 
connected, 138-139 
de Bruijn, 354-355 
directed, 336 
enumeration, 105-106 
excess, 133, 406 
functional, 129-132, 480 
labelled, 96-97, 105-106, 132-136 
map, 513-518 
non-crossing, 485-487, 502-503 
path, 336-356 


periodic, 341 
planar, 517 
random, 134-136 
regular, 133, 189, 379, 395-396, 449, 583-585, 671, 752
spanning tree, 339 
strongly connected, 341 
unicyclic, 133 
unlabelled, 105-106 
greatest common divisor (gcd), 662-664 
Green’s formula, 742 
Gröbner basis, 80, 739
group 
free, 206 
symmetric, 139 


Hadamard product, 303, 422-427, 748 

Hamlet, 54 

Hankel contour, 382, 745 

Hardy—Ramanujan expansion, 579 

harmonic function, 742 

harmonic number (H_n), 117, 161, 389, 724
asymptotics, 723-724, 726, 766 
generating function, 160 

harmonic sum, 763 

Hartogs’ Theorem, 767 

hashing algorithm, 111, 146, 178, 600 

Hayman admissibility, 564-572 

heap of pieces, 81, 308 

Heaviside function, 763 

height of tree, see tree, height 

Hermite polynomial, 334 

hidden pattern, 54, 315-318 

hierarchy, 128, 280, 472-474, 479 

Hipparchus, 69 

histograms, 157 

holomorphic functions, 231 

holonomic functions, 445, 494, 518, 581-585, 747-753

homotopy (of paths), 233 

horizontal generating function, 153 

horse kicks, 627 

hypergeometric function, 423, 525, 750-751 
basic, 315 


implicit construction, 88-91, 137-139, 203-206 

Implicit Function Theorem, 753-755 

implicit-function schema, 467-475 

inclusion—exclusion, 206-214, 367-373 

increasing tree, 143-146, 202-203, 526-528, 
684-685 

Indo-European languages, 473 

inheritance (of parameters), 163, 174 

integer composition, see composition (of inte- 
ger) 

integer partition, see partition (of integer) 

integration (singular), 418-422 

interconnection network, 333 

inverse-function schema, 452-467 

inversion 
analytic, 275
inversion (analytic), 249, 402-407 
inversion table (permutation), 146 
involution (permutation), 122, 333, 558-560, 

691-692 

irregular singularity (ODE), 519, 581-585 
isomorphism (combinatorial, ≅), 19
iteration (of analytic function), 280-283 
iterative specification, 31-34, 250-255 
Iverson’s notation ([[-]]), 58 


Jacobi trace formula, 339 
Jacobian matrix, determinant, 483, 491, 755 


kangaroo hops, 373 

kernel method (functional equation), 508 

kings, 373 

kitten, 517 

Knuth-Ramanujan function, see Ramanujan’s 
Q-function 


labelled class, object, 95-149, 174-181 
labelled construction, 100-106 
labelled product (x), 101 
Lagrange inversion, 66-70, 126, 194, 732-733 
Lambert W-function, 128 
language, 733 
context-free, 82-83, 482 
formal, 49 
regular, 373, 733-735 
Laplace transform, 639, 750, 772-774 
Laplace’s method, 601, 755-762 
for sums, 761-762 
Laplacian, 742 
of graph, 339 
large deviations, 587, 699-703 
large powers, 585-594 
largest components, 300 
Latin rectangle, 752 
lattice path, 76-77, 318-336, 506-513 
decompositions, 320 
initial ascents, 635-637 
lattice points, 49, 589 
Laurent series, 507 
law of large numbers, 158, 162, 728 
law of small numbers, 627 
leader, 103, 136, 141, 142 
leaf (of tree), 182, 737 
Lebesgue measure, integral, 770 
letter (of alphabet), 49 
light bulb, 655 
limit law, 611-718, 776-778 
Lindelöf integrals, 237, 409
linear fractional transformation, 323 
Liouville’s theorem, 237 
local limit law, 593, 615, 694-699 
localization (of zeros and poles), 269 
logarithm, binary (lg), 308
logarithmic-series distribution, 297 
logic (first-order), 467 
logistic map, 536

longest run (in word), 308-312 
loop (in complex region), 233 
Łukasiewicz codes, 75, 511
Lyndon words, 85 


MacMahon’s Master Theorem, 338 
magic duality, 238 
majorant series, 250, 753 
map, 414, 513-518, 713-714 
mapping, 129-132, 462-467, 708, 733 
connected components, 129-136, 449, 671 
idempotent, 571 
regressive, 145 
mapping pattern, see functional graph 
marking variable, 19, 164, 167 
Markov chain, 56, 339, 666 
Markov-Chebyshev inequalities, 161, 729
Master Theorem (of MacMahon), 338 
matrix 
aperiodic, 341 
irreducible, 341 
non-negative, 342 
Perron-Frobenius theory, 340-342, 345
positive, 342 
spectrum, 290 
stochastic, 339, 352 
trace, 339 
transfer, 358-367, 664, 666 
tridiagonal, 367 
matrix integrals, 517 
Matrix Tree Theorem, 339 
Maximum Modulus Principle, 545 
mean, see expectation 
meander (lattice path), 77, 506-513, 637 
meander (topology), 525 
measure theory, 769-771 
Meinardus’ method (integer partitions), 578-580
Mellin transform, 311, 329, 409, 537, 576, 664, 
762-767 
ménage problem, 368 
meromorphic function, 233 
coefficient asymptotics, 289 
singularity perturbation, 650-666 
MGF, see multivariate generating function 
mobile (tree), 454 
Möbius function (μ), 721
Möbius inversion, 89, 722
model theory, 467 
modular form, 331, 577 
moment generating function, see Laplace transform
moment inequalities, 161-163, 729 
moment method, 318, 777-778 
moment pumping, 532-535 
moments (of random variable), 158, 727, 772 
monkey saddle, 542, 545, 600-606 
monodromy, 498 
Morera’s Theorem, 743 
Motzkin numbers, 68, 77, 81, 88
asymptotics, 396, 502, 589
Motzkin path, 77, 319, 326, 330, 511 
multi-index convention, 165, 767 
multinomial coefficient, 100, 187 
multiset construction (MSET), 26, 165 
multivariate generating function (MGF), 151-219


naming convention, 19, 98 

Narayana numbers, 182 

natural boundary, 249 

nearest integer function (⌊·⌉), 43, 260

necklace, 18, 64 

negative binomial distribution, 451, 621, 627,
775

Neptune, 339 

nested sequences, 290, 291, 318-336 

network, 333 

neutral object, 24, 98 

Newton polygon, 498-500 

Newton’s binomial expansion, 35 

Newton-Puiseux expansion, see Puiseux expansion

Newton-Raphson iteration, 88 

nicotine, 21 

non-crossing configuration, 485-487, 502-503 

non-plane tree, 71-72, 127 

non-recursive specification, see iterative specification

Nörlund-Rice integrals, 238

normal distribution, see Gaussian distribution 

normalization (of random variable), see standardization

numerology, 318 


O (asymptotic notation), 722 

o (asymptotic notation), 722 

ODE (ordinary differential equation), see differential equations

OGF, see ordinary generating function 

order constraints (in constructions), 139-146, 
201-203 

ordinary generating function (OGF), 19 

ordinary point (analytic function), 543 

orthogonal polynomials, 323, 332 

oscillations (of coefficients), 264, 283, 384 

outdegree, see degree (of tree node) 


P-recurrence, 748-749 
Painlevé equation, 532, 598 
pairing (permutation), 122 
parallelogram polyomino, 660-662 
parameter (combinatorial), 151-219 
cumulated value, 159 
inherited, 163-165 
recursive, 181-185 
parenthesis system, 77 
parking, 146, 534 
parse tree, 82 


partially commutative monoid, 307-308 
partition (of integer), 39-49 
asymptotics, 248, 574-581 
denumerant, 43, 257-258 
distinct summands, 579 
Durfee square, 45 
Ferrers diagram, 39 
Hardy-Ramanujan-Rademacher expansion,
579
largest summand, 44 
Meinardus’ method, 578-580 
number of summands, 44, 171, 581, 666 
plane, 580 
prime summands, 580 
profile, 171 
r-parts, 172
partition of set, see set partition 
path (in graph), 336 
path (in complex region), 233 
path length, see tree 
patterns 
in permutations, 211, 689 
in trees, 213-214, 680-681 
in words, 54-56, 58-62, 211, 271-274, 315-318,
659-660, 666
pentagonal numbers, 49 
periodicity conditions 
coefficients, 264, 266, 302 
Daffodil Lemma, 266 
generating function, 294, 302 
graph, 341 
linear system, 341 
polynomial system, 483 
permutation, 17, 98, 119-124 
alternating, 143-144, 269 
ascending runs, 209-211, 658-659, 697-698 
avoiding exceedances, 368 
bordering condition, 202 
cycles, see also Stirling numbers (1st kind), 
119-124, 155, 175-177, 448, 644-645, 
671 
cycles of length m, 625-627 
cyclic, 99 
derangement, 122, 207, 261, 368, 448, 671, 
760 
exceedances, 368 
fixed order, 569 
increasing subsequences, 596-598 
indecomposable, 89, 139 
inversion table, 146 
involution, 122, 248, 333, 558-560, 596, 
691-692 
local order types, 202-203 
longest cycle, 122, 569 
longest increasing subsequence, 211, 596-598,
716, 752-753
ménage, 368 
pairing, 122 
pattern, 211, 689 
profile, 175
records, 140-141, 644-645 
rises, 209-211 
shortest cycle, 122, 261-262 
singletons, 622-623 
succession gap, 373 
tree decomposition, 143-144 
Perron-Frobenius theory, 340-342, 345
perturbation theory, 11-12, 591, 612, 617-618, 
650-694, 703 
PGF, see probability generating function 
phase transition, 704-714 
diagram, 704 
phylogenetic trees, 129 
Picard approximants, 754 
Plana’s summation, 238 
planar graph, 517 
plane partition (of integer), 580 
plane tree, 65-70 
pointing construction (Θ), 86-88, 136-137, 198
Poisson distribution, 176, 451, 572-574, 627, 
643, 775 
Poisson-Dirichlet process, 676
poissonization, 572-574 
political (in)correctness, 146 
Pólya operators, 34, 252, 447, 475-482
Pólya theory, 83, 85-86
Pólya urn process, see urn model
Pólya-Carlson Theorem, 253
Pólya-Redfield Theorem, 85
polydisc, 767 
polylogarithm, 237, 408-411, 749-750 
polynomial 
primitive, 358 
polynomial (finite field), 90-91, 449-450,
662-664, 672-673
polynomial system, 488, 494 
polyomino, 45, 201, 331, 363, 365-367, 535, 
660-662 
power series, 15, 19, 97, 153, 164, 187, 730-731 
convergence, 731 
divergent, 89, 138, 731 
formal topology, 731 
product, 731 
quasi-inverse, 731 
sum, 731 
powerset construction (PSET), 26, 165 
preferential arrangement numbers, 109 
preorder traversal (tree), 74 
prime number, 228, 721 
Prime Number Theorem, 91 
principal determination (function), 230 
Pringsheim’s theorem, 240 
prisoners, 124, 176 
probabilistic method, 729 
probability (P), 113, 157 
probability distribution 
Airy area, 365, 707 
Airy map, 713-714 
arcsine law, 705

Bernoulli, 775 

binomial, 627, 642, 775 

double exponential, 118, 308-311 
Gaussian, 593-594, 638, 776 
geometric, 775 

geometric-birth, 314

logarithmic series, 296, 775 

negative binomial, 451, 621, 627, 775 
Poisson, 451, 572-574, 627, 643, 775 
Rayleigh, 116, 708 

stable laws, 413, 707-714 

theta function, 328, 360, 538 
Tracy-Widom, 598

Zipf laws, 711 


probability generating function (PGF), 157,
623, 728, 773
probability space, 769 
profile (of objects), 169, 451-452 
pruned binary tree, 738 
psi function (ψ), 725, 746


Puiseux expansion (algebraic function), 444,
498-500


q-calculus, 45, 49, 315, 331, 661 
quadratic method (functional equation), 515 
quadtree, 522-525, 687-688 
quasi-inverse, 34, 291, 731 
matrix, 349 
quasi-powers, 11, 586, 612, 644-690 
generalized, 690-694 
large deviations, 699-703 
local limit law, 694-699 
main theorem, 645-648 


Rabin-Scott Theorem, 57-59, 735
radioactive decay, 627 
radius of convergence (series), 230, 243-244 
Radon-Nikodym Theorem, 771
Ramanujan’s Q-function, 115, 130, 416-417 
random generation, 77, 300 
random matrix, 597, 674 
random number generator, 465 
random variable, 727, 769-778 
continuous, 638-644, 771 
density, 771 
discrete, 157, 620-628, 771 
random walk, see walk 
rational function, 236, 255-258, 269-271 
positive, 356, 357 
Rayleigh distribution, 116, 708 
record 
in permutation, 140-141 
in word, 189 
recurrence 
tree, 427-433 
recursion (semantics of), 33 
recursive parameter, 181-185 
recursive specification, 32-34 
region (of complex plane), 229 
regular
expression, 373, 733-735 
language, 300-308, 373, 733-735 
specification, 300-308 
regular graph, see graph, regular 
regular point (analytic function), 239 
regular singularity (ODE), 519-525 
relabelling, 100 
removable singularity, see singularity, apparent 
renewal process, 300, 655 
Res (residue operator), 233 
residue, 233-238 
Cauchy’s theorem, 234 
resultant (R), 80, 739-741 
Riccati differential equation, 689 
Rice integrals, see Nörlund-Rice integrals
Riemann surface, 239 
Rogers-Ramanujan identities, 331
rotation correspondence (tree), 73 
Rouché’s theorem, 270 
round (children’s), 397 
RV, see random variable 


SA (amenable to singularity analysis), 401 
saddle-point 
analytic function, 543-546 
bounds, 246, 546-550, 586 
depoissonization, 572-574 
dimensioning heuristic, 554, 555, 566 
large powers, 585-594 
method, 541-608 
multiple, 545, 600-606 
perturbation, 690-694 
scaling (random variable), 773 
schema (combinatorial-analytic), see also
composition schema, context-free specification,
exp-log schema, implicit-function schema,
inverse function schema, nested sequences,
regular specification, simple variety (of trees),
supercritical sequence schema, 12, 170-171,
178-181, 289
Schröder’s problems, 69, 129, 474
section (of sequence), 302 
self-avoiding configurations, 363-365 
semantics of recursion, 33 
sequence construction (SEQ), 25, 165 
labelled, 102, 174 
series-parallel network, 69, 72
set construction (SET), 102, 174 
set partition, see also Bell numbers, Stirling 
numbers (2nd kind), 62-64, 106-119, 179 
asymptotics, 247, 560-562 
block, 108 
largest block, 569 
number of blocks, 179, 594-596, 692-693 
several complex variables, 767-768 
shifting of the mean, 700, 701 
shuffle product, 306 
sieve formula, see inclusion-exclusion
Simon Newcomb’s problem, 192-193 


simple variety (of trees), 66, 128, 194, 452 
singular expansion (function), 393 
singularity, 239-243 
algebraic-logarithmic, 376, 393
apparent, 243, 743 
dominant, 242 
irregular (ODE), 581, 585 
perturbation, 650-690 
regular (ODE), 519-525 
removable, 243, 743 
singularity analysis, 375-438 
applications, 439-540 
perturbation, 650-690 
uniform expansions, 668-669 
singularity perturbation, 703-707 
size (of combinatorial object), 16, 96 
size-biased (probability), 461 
Skolem-Mahler-Lech Theorem, 266 
slicing, 199, 366, 508 
slow variation, 434 
Smirnov word, 204, 262, 312, 350 
society (combinatorial class), 571 
spacings, 52 
span (of sequence, GF), 266 
spanning tree, 339 
special functions, 747-753 
species, 30, 94, 137, 149 
specification, 33 
iterative, 31-34, 250-255, 280 
recursive, 32-34
spectrum, see matrix 
speed of convergence (probability), 624-625, 
638-639 
squaring of the circle, 758 
stable laws, see probability distribution 
standard deviation (σ), 728
standardization (random variable), 614, 638, 
773 
star-continuable function, 398 
statistical physics, 46, 81, 201, 362-363, 440, 
525, 704 
steepest descent, 544, 547, 607 
Stieltjes integral, 770-771 
Stirling numbers, 735-737 
cycle (1st kind), 121, 155, 644-645, 654, 698 
partition (2nd kind), 62-64, 109, 179, 653-654,
692-694
Stirling’s approximation, 37, 407, 410, 555-558,
747, 760-761, 766
Stokes phenomenon, 582-583 
string, see word 
strip ((-)), 763 
subcritical composition schema, see composition schema
subexponential factor, 243 
subsequence statistics, see hidden patterns, 
words 
substitution construction (∘), 86-88, 136-137,
198-201 
supercritical composition schema, see composition schema
supercritical cycle, 414 
supercritical sequence, 293-300, 652-655 
supernecklace, 125 
supertree, 412-414, 503, 714 
support (of probability measure), 769 
support (of sequence, GF), 266 
surjection, 106-119, 296, 653-654 
asymptotics, 259 
complete GF, 188 
surjection numbers, 109, 268 
symbolic manipulation, 253 
symbolic method, 15, 22, 33, 92, 104 
symmetric functions, 189, 752-753 


Tauberian theory, 434, 572 
Taylor expansion, 201, 723, 726, 742 
theory of species, see species 
theta function, 328-330, 360, 538 
threshold phenomenon, 211 
tiling, 360-363, 665 
total variation distance (probability), 623 
totient function (φ), 27, 721
trace monoid, see partially commutative monoid 
trains, 253-255, 398 
transcendental function, 506 
transfer matrix, 358-367, 664-666 
transfer operator, 664 
transfer theorem, 389-392 
tree, 31, 64-72, 125-136, 737 
additive functional, 457-462 
balanced, 91, 280-283 
binary, see also Catalan numbers, 67, 738 
branching processes, 196-198 
Catalan, 35 
Cayley, see also Tree function (T), 127-129
degree profile, 194, 459-460 
exponential bounds, 277-280 
forests, 68 
general, 31, 738 
height, 216, 327-330, 458-459, 535-538 
increasing, 143-146, 202-203, 526-528, 
684-685 
leaf, 182, 473, 678, 737 
level profile, 194-195, 458-459, 711-712 
Łukasiewicz codes, 75
mobile, 454 
non-crossing, 485-487, 502-503 
non-plane, 71-72, 462, 475-482 
non-plane, labelled, 127 
parse tree, 82 
path length, 184-185, 195, 461, 534-535, 
706-707 
pattern, 213-214, 680-681 
plane, 65-70, 738 
plane, labelled, 126 
quadtree, 522-525, 687-688 
regular, 68 
root subtrees, 633 
root-degree, 173, 179, 456-457, 632
rooted, 737 
search, 203 
simple variety, 66, 128, 194, 404-407, 452-467,
589-590, 633, 683, 711-712
supertree, 412-414, 503, 714 
t-ary, 68 
unary-binary, see also Motzkin numbers, 68, 
88, 396, 501 
unrooted, 132, 480-482 
valuated, 414 
width, 359-360, 666, 712 
tree concepts, 737-738 
Tree function (T), 127-128, 403-407 
tree recurrence, 427-433 
triangulation (of polygon), 17, 20, 35-36, 79 
tridiagonal matrix, 367 
trinomial numbers, 588 
trivial bound (integration), 547 
truncated exponential, 111 


unambiguous, see ambiguity 
unary-binary tree, see tree, unary-binary and
Motzkin numbers

undirected cycle construction (UCYC), 86, 133 
undirected sequence construction (USEQ), 86 
uniform expansions 

asymptotics, 725-726 

singularity analysis, 668-669, 676 
uniform probability measure, 727 
uniformization (algebraic function), 497 
universality, 7, 12, 440-443, 455, 606 
unlabelled structures, 163-174 
unrooted tree, see tree, unrooted 
urn (combinatorial class), 99 
urn model, 118, 336, 529-531 


Vallée’s identity, 30 

valley (saddle-point), 544 

variance (V), 728 

vertical generating function, 153 

Vitali’s theorem (analytic functions), 624 


w.h.p. (with high probability), 135, 162 
walk, 367 
birth type, 312-315 
cover time, 363 
devil’s staircase, 352-353 
in graphs, 336-356 
integer line, 319-324 
interval, 319-330 
lattice path, 76-77, 318-336, 506-513 
self-avoiding, 363-365 
Wallis integral, 747, 758 
weak convergence (probability distributions), 
621 
Weierstrass Preparation Theorem (WPT), 754-755
wheel, 47 
width (of tree), 359-360, 666, 712 


winding number, 270 

word, 49-64, 111-119 
aperiodic, 85 
code, 62 
excluded patterns, 355 
language, 49, 733 
local constraints, 349 
longest run, 308-312 
pattern, 54-56, 58-62, 211, 271-274, 315-318,
659-660, 666

record, 189 
runs, 51-54, 204
Smirnov, 204, 262, 312, 350 


Young tableau, 752 


zeta function of graphs, 346 

zeta function, Riemann (ζ), 228, 269, 408, 721,
746-747, 752 

Zipf laws, 711 


